Machine Learning for Mechanical Engineering

Autoencoder

Instructor: Prof. Seungchul Lee
http://iai.postech.ac.kr/
Industrial AI Lab at POSTECH

Problem 1

1) Build an autoencoder whose latent space is 10-dimensional, using the MNIST dataset.

Download files

In [6]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

mnist_train_images = np.load('./data_files/mnist_train_images_rev.npy')
mnist_train_labels = np.load('./data_files/mnist_train_labels_rev.npy')
mnist_test_images = np.load('./data_files/mnist_test_images_rev.npy')
mnist_test_labels = np.load('./data_files/mnist_test_labels_rev.npy')

train_labels = mnist_train_labels
train_imgs = mnist_train_images
test_labels = mnist_test_labels
test_imgs = mnist_test_images

## write your code here
#
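
One possible solution sketch (not the only valid design): use sklearn's MLPRegressor as an autoencoder by training it to reproduce its own input, with a 10-D bottleneck in the middle. This assumes train_imgs is flattened to shape (N, 784) with pixel values scaled to [0, 1]; the hidden sizes around the bottleneck are free choices.

In [ ]:
from sklearn.neural_network import MLPRegressor

# encoder 784 -> 256 -> 10 (latent), decoder 10 -> 256 -> 784
reg = MLPRegressor(hidden_layer_sizes = (256, 10, 256),
                   activation = 'tanh',
                   solver = 'adam',
                   max_iter = 50,
                   verbose = True)
reg.fit(train_imgs, train_imgs)   # target = input: learn to reconstruct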

2) Show 5 random reconstructed images.

In [7]:
## write your code here
#
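
A possible sketch, reusing the trained reg model from the sketch above: pick 5 random test images, reconstruct them with predict, and show input/reconstruction pairs.

In [ ]:
# choose 5 random test images and reconstruct them
idx = np.random.choice(test_imgs.shape[0], 5, replace = False)
x_reconst = reg.predict(test_imgs[idx])

plt.figure(figsize = (15, 6))
for i in range(5):
    plt.subplot(2, 5, i + 1)
    plt.imshow(test_imgs[idx[i]].reshape(28, 28), 'gray')
    plt.title('Input'); plt.xticks([]); plt.yticks([])
    plt.subplot(2, 5, i + 6)
    plt.imshow(x_reconst[i].reshape(28, 28), 'gray')
    plt.title('Reconstructed'); plt.xticks([]); plt.yticks([])
plt.show()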

3) Classify the 10 digit classes by using an MLP on the latent space. Note that an accuracy above 65% is good enough.

In [8]:
from sklearn.neural_network import MLPClassifier

## write your code here
#
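
A possible sketch: since sklearn does not expose intermediate activations, run the encoder half of reg by hand (its first two weight matrices with tanh activations, matching the (256, 10, 256) sketch above), then fit an MLPClassifier on the 10-D codes. The label handling below is an assumption covering both integer and one-hot label formats.

In [ ]:
def encode(model, X, n_enc = 2):
    # forward pass through the first n_enc (encoder) hidden layers
    a = X
    for W, b in zip(model.coefs_[:n_enc], model.intercepts_[:n_enc]):
        a = np.tanh(a @ W + b)
    return a

train_latent = encode(reg, train_imgs)
test_latent = encode(reg, test_imgs)

# convert one-hot labels to integer classes if necessary
train_y = train_labels if train_labels.ndim == 1 else np.argmax(train_labels, axis = 1)
test_y = test_labels if test_labels.ndim == 1 else np.argmax(test_labels, axis = 1)

clf = MLPClassifier(hidden_layer_sizes = (100,), max_iter = 300)
clf.fit(train_latent, train_y)
print('Accuracy:', clf.score(test_latent, test_y))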

Problem 2

PCA (Principal Component Analysis)

  • PCA is one of the oldest and most widely used dimensionality reduction algorithms. Its idea is to reduce the dimensionality of a dataset while preserving as much 'variability' as possible. (You do not need to fully understand the PCA algorithm for this problem.)

An autoencoder is closely related to PCA in the sense of dimension reduction, so in this problem we will reproduce PCA with an autoencoder. PCA is a linear dimension reduction method, whereas an autoencoder normally uses non-linear activation functions; an autoencoder without non-linear activation functions can therefore be regarded as PCA.

Now we have 3D data. Run the cell below to load the data and plot it in 3D.

Data Download Link (same)

In [6]:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
%matplotlib inline

data = np.load('./data_files/pca_autoencoder.npy')

fig = plt.figure(figsize = (10, 8))
ax = fig.add_subplot(111, projection = '3d')
ax.scatter(data[:,0], data[:,1], data[:,2])
plt.show()

X_train, X_test = data[:100], data[100:]

The PCA result in 2D is shown below.

In [11]:
from sklearn.decomposition import PCA

pca = PCA(n_components = 2)
pca.fit(data)
result = pca.transform(data)

plt.figure(figsize = (8, 6))
plt.plot(result[:,0], result[:,1], 'o')
plt.axis('equal')
plt.xlabel('PC1')
plt.ylabel('PC2')
plt.show()

(1) Design your own linear autoencoder model using an ANN. (You may freely design the network structure.)

In [ ]:
## Your code here
#
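
A minimal sketch: with activation = 'identity', the MLPRegressor below is a purely linear map through a 2-D bottleneck, which is the linear autoencoder this problem asks for. The single hidden layer of size 2 and the name linear_ae are free choices.

In [ ]:
from sklearn.neural_network import MLPRegressor

linear_ae = MLPRegressor(hidden_layer_sizes = (2,),
                         activation = 'identity',   # no non-linearity -> PCA-like
                         solver = 'adam',
                         max_iter = 2000)
linear_ae.fit(X_train, X_train)   # reconstruct the input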

(2) After training your model, plot the data in the latent space.

In [ ]:
## Your code here
#
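
A possible sketch, assuming the linear_ae model above: with identity activations the encoder is just the first affine layer, so the latent codes are data @ coefs_[0] + intercepts_[0].

In [ ]:
# encode all points with the linear encoder and plot the 2-D codes
latent = data @ linear_ae.coefs_[0] + linear_ae.intercepts_[0]

plt.figure(figsize = (8, 6))
plt.plot(latent[:, 0], latent[:, 1], 'o')
plt.axis('equal')
plt.xlabel('Z1')
plt.ylabel('Z2')
plt.show()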

Problem 3

We are going to make an autoencoder model using EMNIST handwritten alphabet images. The EMNIST dataset is a set of handwritten character digits derived from the NIST Special Database 19 and converted to a 28 $\times$ 28 pixel image format and dataset structure that directly matches the MNIST dataset.

Let's load the dataset.

In [ ]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import accuracy_score
%matplotlib inline

x_total = np.load('./data_files/Alphabet_Handwritten_1000_images.npy')
y_total = np.load('./data_files/Alphabet_Handwritten_1000_labels.npy')

x_train, x_test, y_train, y_test = train_test_split(x_total, y_total, test_size = 0.15, random_state = 77)

print('Trainset shape:', x_train.shape, y_train.shape)
print('Testset shape:', x_test.shape, y_test.shape)

del x_total, y_total

We count the number of images of each character in the training set and plot one example per class in descending order of frequency.

In [ ]:
count = {}

for i in range(26):
    count[i] = len(np.where(y_train == i)[0])
    print(chr(65 + i), count[i])

count = sorted(count.items(), reverse = True,  key = lambda item: item[1])
In [ ]:
plt.figure(figsize = (18, 12))
sp = 1

for i, _ in count:
    plt.subplot(5, 6, sp)
    r = np.random.choice(np.where(y_train == i)[0])
    plt.imshow(x_train[r].reshape(28, 28), 'gray_r')
    plt.title('Class: {}'.format(chr(65 + i)))
    plt.xticks([]); plt.yticks([])
    sp += 1
    
plt.show()

1) Make your own autoencoder model using the top 3 most frequent letters in the training set. Set the dimension of the latent space to 2.

  • Dimension of latent space = 2
  • Number of encoder hidden layers = 3, with sizes (500, 300, 200)
  • Number of decoder hidden layers = 3, with sizes (200, 300, 500)
  • activation = tanh
  • solver = adam
In [ ]:
train_x = np.zeros((28*28))   # dummy seed row, removed after stacking
train_y = []
test_x = np.zeros((28*28))    # dummy seed row, removed after stacking
test_y = []

# collect the images of the top 3 most frequent classes and relabel them 0, 1, 2
for i in range(3):
    idx = count[i][0]
    train_x = np.vstack((train_x, x_train[np.where(y_train == idx)[0]]))
    for j in range(x_train[np.where(y_train == idx)[0]].shape[0]):
        train_y.append(i)
    test_x = np.vstack((test_x, x_test[np.where(y_test == idx)[0]]))
    for j in range(x_test[np.where(y_test == idx)[0]].shape[0]):
        test_y.append(i)

# drop the dummy rows and scale pixel values to [0, 1]
train_x = train_x[1:]/255
test_x = test_x[1:]/255
train_y = np.asarray(train_y)
test_y = np.asarray(test_y)
In [ ]:
## write your code here
#
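
A possible sketch with the required structure: stack the encoder and decoder sizes into one MLPRegressor, with the 2-D latent layer in the middle. The name reg matches the variable used by the plotting code further down.

In [ ]:
from sklearn.neural_network import MLPRegressor

# 784 -> 500 -> 300 -> 200 -> 2 (latent) -> 200 -> 300 -> 500 -> 784
reg = MLPRegressor(hidden_layer_sizes = (500, 300, 200, 2, 200, 300, 500),
                   activation = 'tanh',
                   solver = 'adam',
                   max_iter = 50,
                   verbose = True)
reg.fit(train_x, train_x)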

2) Reconstruct a random image of each character (the top 3 most frequent letters) from the test set.

In [ ]:
# fill out the blank
# (one possible answer: pick one random test index per class)

idxs = [np.random.choice(np.where(test_y == i)[0]) for i in range(3)]

# plotting
for idx in idxs:
    x_reconst = reg.predict(test_x[idx].reshape(-1,784))
    plt.figure(figsize = (10,8))
    plt.subplot(1,2,1)
    plt.imshow(test_x[idx].reshape(28,28), 'gray')
    plt.title('Input Image', fontsize = 15)
    plt.xticks([])
    plt.yticks([])
    plt.subplot(1,2,2)
    plt.imshow(x_reconst.reshape(28,28), 'gray')
    plt.title('Reconstructed Image', fontsize = 15)
    plt.xticks([])
    plt.yticks([])
    plt.show()

3) Show the latent variables of the test set in 2D space.

In [ ]:
## Write your code here
#
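# One possible sketch: run the encoder half of reg by hand. With
# hidden_layer_sizes (500, 300, 200, 2, 200, 300, 500), the 2-D latent is
# the 4th hidden layer, so apply the first 4 weight matrices with tanh.
def encode(model, X):
    a = X
    for W, b in zip(model.coefs_[:4], model.intercepts_[:4]):
        a = np.tanh(a @ W + b)
    return a

test_latent = encode(reg, test_x)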

# plotting
plt.figure(figsize = (10,10))
plt.scatter(test_latent[test_y == 0,0], test_latent[test_y == 0,1], label = 'A')
plt.scatter(test_latent[test_y == 1,0], test_latent[test_y == 1,1], label = 'Y')
plt.scatter(test_latent[test_y == 2,0], test_latent[test_y == 2,1], label = 'G')
plt.title('Latent Space', fontsize=15)
plt.xlabel('Z1', fontsize=15)
plt.ylabel('Z2', fontsize=15)
plt.legend(fontsize = 15)
plt.axis('equal')
plt.show()

4) Suppose we do not know the labels of the test set. Do K-means clustering (with $k = 3$) on the latent variables of the test set.

In [ ]:
from sklearn.cluster import KMeans

## write your code here
#
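# A minimal sketch: cluster the 2-D latent codes into k = 3 groups
kmeans = KMeans(n_clusters = 3, random_state = 0)
kmeans.fit(test_latent)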


# plotting
plt.figure(figsize = (10,10))
plt.scatter(test_latent[:,0], test_latent[:,1])
plt.title('Latent Space', fontsize = 15)
plt.xlabel('Z1', fontsize = 15)
plt.ylabel('Z2', fontsize = 15)
plt.axis('equal')
plt.show()

plt.figure(figsize = (10,10))
plt.plot(test_latent[kmeans.labels_ == 0,0],test_latent[kmeans.labels_ == 0,1], 'b.', label = 'C0', markersize = 12)
plt.plot(test_latent[kmeans.labels_ == 1,0],test_latent[kmeans.labels_ == 1,1], 'g.', label = 'C1', markersize = 12)
plt.plot(test_latent[kmeans.labels_ == 2,0],test_latent[kmeans.labels_ == 2,1], 'r.', label = 'C2', markersize = 12)
plt.axis('equal')
plt.title('Latent Space', fontsize = 15)
plt.xlabel('Z1', fontsize = 15)
plt.ylabel('Z2', fontsize = 15)
plt.legend(fontsize = 12)
plt.show()

5) Generate fake letters decoded from the centroids of each cluster. (The result may not exactly match any input image.)

In [ ]:
## write your code here
#
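# Sketch: the decoder is the second half of reg -- tanh hidden layers,
# then an identity output layer (MLPRegressor's output activation).
def decoder(Z):
    a = Z
    for W, b in zip(reg.coefs_[4:7], reg.intercepts_[4:7]):
        a = np.tanh(a @ W + b)
    return a @ reg.coefs_[7] + reg.intercepts_[7]

# one (1, 2) latent point per cluster centroid
latents = [c.reshape(1, 2) for c in kmeans.cluster_centers_]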

# plotting
for latent in latents:
    reconst = decoder(latent)
    plt.figure(figsize = (16,7))
    plt.subplot(1,2,1)
    plt.plot(test_latent[kmeans.labels_ == 0,0],test_latent[kmeans.labels_ == 0,1], 'b.', label = 'C0')
    plt.plot(test_latent[kmeans.labels_ == 1,0],test_latent[kmeans.labels_ == 1,1], 'g.', label = 'C1')
    plt.plot(test_latent[kmeans.labels_ == 2,0],test_latent[kmeans.labels_ == 2,1], 'r.', label = 'C2')
    plt.scatter(latent[:,0], latent[:,1], c = 'k', marker = 'o', s = 200, label = 'new data')
    plt.title('Latent Space', fontsize = 15)
    plt.xlabel('Z1', fontsize = 15)
    plt.ylabel('Z2', fontsize = 15)
    plt.legend(loc = 2, fontsize = 12)
    plt.axis('equal')
    plt.subplot(1,2,2)
    plt.imshow(reconst.reshape(28,28), 'gray')
    plt.title('Generated Fake Image', fontsize = 15)
    plt.xticks([])
    plt.yticks([])
    plt.show()

Problem 4

The encoder part of an autoencoder is well known as a dimensionality reduction operator. This problem asks you to implement the autoencoder algorithm for face data. The given data consists of 100 pictures of human faces, each of size (50, 40); we will apply an autoencoder to this dataset.

Download data from here.

In [ ]:
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline

data = np.load(open('./data_files/pca_faces.npy', 'rb'))
print(data.shape)

(a) Plot one random face out of the 100 pictures. You might want to run the cell multiple times to see what kinds of faces are in the dataset.

In [ ]:
## write your code here
#
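# One possible answer: pick a random face; data has shape (100, 50, 40)
sample = data[np.random.randint(data.shape[0])]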

# plotting
plt.figure(figsize = (10,8))
plt.imshow(sample, 'gray')
plt.axis('off')
plt.show()

(b) Apply the autoencoder to the dataset. Build your model with the following structure:

  • first encoder: 500

  • second encoder: 300

  • latent node: 8

  • first decoder: 300

  • second decoder: 500

  • activation = 'relu'

In [ ]:
train_face = data.reshape([100, 50*40])
print(train_face.shape)
In [ ]:
## write your code here for training
#
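
A possible training sketch with the given structure: encoder (500, 300), latent 8, decoder (300, 500), relu activations. Scaling the pixels to [0, 1] (assuming 8-bit images) is an extra choice that tends to make training better behaved; the name face_ae is arbitrary.

In [ ]:
from sklearn.neural_network import MLPRegressor

X = train_face/255   # assume 8-bit pixels; scale to [0, 1]

# 2000 -> 500 -> 300 -> 8 (latent) -> 300 -> 500 -> 2000
face_ae = MLPRegressor(hidden_layer_sizes = (500, 300, 8, 300, 500),
                       activation = 'relu',
                       solver = 'adam',
                       max_iter = 1000)
face_ae.fit(X, X)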
In [ ]:
## write your code here to plot the reconstructed images of random input image
## Do not worry too much about poor reconstruction performance
#
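# Sketch: reconstruct a random face with the face_ae model above
# (input scaled to [0, 1] to match training)
test = train_face[np.random.randint(train_face.shape[0])]
recons = face_ae.predict(test.reshape(1, -1)/255)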

# plotting
plt.figure(figsize = (15,12))
plt.subplot(1,2,1)
plt.imshow(test.reshape(50,40), 'gray')
plt.title('Input Image', fontsize = 15)
plt.xticks([])
plt.yticks([])
plt.subplot(1,2,2)
plt.imshow(recons.reshape(50,40), 'gray')
plt.title('Reconstructed Image', fontsize = 15)
plt.xticks([])
plt.yticks([])
plt.show()

(c) Reconstruct the face wearing a crown with your autoencoder and discuss the result. Do not worry too much about poor reconstruction performance.

In [ ]:
plt.figure(figsize = (10,8))
plt.imshow(data[94], 'gray')
plt.axis('off')
plt.show()
In [ ]:
## write your code here
#
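# Sketch: the face with a crown is data[94] (shown above); reuse face_ae
test_face = data[94].reshape(1, -1)
recons = face_ae.predict(test_face/255)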

# plotting
plt.figure(figsize = (15,12))
plt.subplot(1,2,1)
plt.imshow(test_face.reshape(50,40), 'gray')
plt.title('Input Image', fontsize = 15)
plt.xticks([])
plt.yticks([])
plt.subplot(1,2,2)
plt.imshow(recons.reshape(50,40), 'gray')
plt.title('Reconstructed Image', fontsize = 15)
plt.xticks([])
plt.yticks([])
plt.show()

(d) Reconstruct the face wearing sunglasses with your autoencoder and discuss the result. Do not worry too much about poor reconstruction performance.

In [ ]:
plt.figure(figsize = (10,8))
plt.imshow(data[28], 'gray')
plt.axis('off')
plt.show()
In [ ]:
## write your code here
## Do not worry too much about poor reconstruction performance
#
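# Sketch: the face with sunglasses is data[28] (shown above); reuse face_ae
test_face = data[28].reshape(1, -1)
recons = face_ae.predict(test_face/255)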

# plotting
plt.figure(figsize = (15,12))
plt.subplot(1,2,1)
plt.imshow(test_face.reshape(50,40), 'gray')
plt.title('Input Image', fontsize = 15)
plt.xticks([])
plt.yticks([])
plt.subplot(1,2,2)
plt.imshow(recons.reshape(50,40), 'gray')
plt.title('Reconstructed Image', fontsize = 15)
plt.xticks([])
plt.yticks([])
plt.show()