Deep Learning for Mechanical Engineering

Homework 07

Due Monday, 11/06/2021, 4:00 PM

Prof. Seungchul Lee
Industrial AI Lab at KAIST
  • For your handwritten solutions, please scan or take a picture of them. Alternatively, you can write them in markdown if you prefer.

  • Only .ipynb files will be graded for your code.

    • Ensure that your NAME and student ID are included in the names of your .ipynb files, e.g., IljeokKim_20202467_HW07.ipynb
  • Compress all the files into a single .zip file.

    • In the .zip file's name, include your NAME and student ID. ex)
    • Submit this .zip file on KLMS
  • Do not submit a printed version of your code, as it will not be graded.

Problem 1: Load the dataset

We will create a convolutional neural network to classify images of berries, birds, dogs, and flowers. To get started, we need to download the dataset. This dataset will be utilized for both Problem 2 and Problem 3.

(1) Load the provided dataset.

In [ ]:
import cv2
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
In [ ]:
from google.colab import drive
drive.mount('/content/drive')  # mount Google Drive so the dataset files are accessible
Mounted at /content/drive
In [ ]:
## your code here

train_image =
train_label =
test_image =
test_label =

(2) Visualize ten randomly selected images from the training dataset.

In [ ]:
## your code here
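One possible way to draw the ten random samples is sketched below. It assumes `train_image` is an array of shape (N, H, W, 3) in RGB order that `imshow` can display directly, and that `train_label` holds integer class indices; neither is confirmed by the assignment, and the helper name `show_random_images` is hypothetical. If the images were read with `cv2.imread`, they would be in BGR order and would need converting first.

```python
import numpy as np
import matplotlib.pyplot as plt


def show_random_images(images, labels, n=10, seed=None):
    """Plot n images chosen at random (without repetition) in a two-row grid.

    Assumes images has shape (N, H, W, 3) and labels has shape (N,).
    Returns the chosen indices and the matplotlib figure.
    """
    rng = np.random.default_rng(seed)
    indices = rng.choice(len(images), size=n, replace=False)

    cols = (n + 1) // 2
    fig, axes = plt.subplots(2, cols, figsize=(2 * cols, 4.5))
    for ax in axes.ravel():
        ax.axis("off")  # hide ticks, including on any unused cell
    for ax, idx in zip(axes.ravel(), indices):
        ax.imshow(images[idx])
        ax.set_title(str(labels[idx]))
    fig.tight_layout()
    return indices, fig
```

In the notebook this could be called as `show_random_images(train_image, train_label)` followed by `plt.show()`. Sampling with `replace=False` guarantees ten distinct images.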

Problem 2: Transfer Learning

We will use the VGG16 architecture to train on our dataset. As shown in the image below, VGG16 consists of 16 weight layers (13 convolutional and 3 fully connected) arranged in five convolutional blocks followed by a classifier, with roughly 138 million trainable parameters. Fortunately, deep learning libraries such as TensorFlow, Keras, and PyTorch provide VGG16 models pre-trained on ImageNet, sparing us from designing and training a model from scratch.

(1) Create a VGG16 model using deep learning libraries, such as TensorFlow, Keras, or PyTorch.

In [ ]:
## your code here
Downloading data from
553467096/553467096 [==============================] - 7s 0us/step
Model: "vgg16"
 Layer (type)                Output Shape              Param #   
 input_1 (InputLayer)        [(None, 224, 224, 3)]     0         
 block1_conv1 (Conv2D)       (None, 224, 224, 64)      1792      
 block1_conv2 (Conv2D)       (None, 224, 224, 64)      36928     
 block1_pool (MaxPooling2D)  (None, 112, 112, 64)      0         
 block2_conv1 (Conv2D)       (None, 112, 112, 128)     73856     
 block2_conv2 (Conv2D)       (None, 112, 112, 128)     147584    
 block2_pool (MaxPooling2D)  (None, 56, 56, 128)       0         
 block3_conv1 (Conv2D)       (None, 56, 56, 256)       295168    
 block3_conv2 (Conv2D)       (None, 56, 56, 256)       590080    
 block3_conv3 (Conv2D)       (None, 56, 56, 256)       590080    
 block3_pool (MaxPooling2D)  (None, 28, 28, 256)       0         
 block4_conv1 (Conv2D)       (None, 28, 28, 512)       1180160   
 block4_conv2 (Conv2D)       (None, 28, 28, 512)       2359808   
 block4_conv3 (Conv2D)       (None, 28, 28, 512)       2359808   
 block4_pool (MaxPooling2D)  (None, 14, 14, 512)       0         
 block5_conv1 (Conv2D)       (None, 14, 14, 512)       2359808   
 block5_conv2 (Conv2D)       (None, 14, 14, 512)       2359808   
 block5_conv3 (Conv2D)       (None, 14, 14, 512)       2359808   
 block5_pool (MaxPooling2D)  (None, 7, 7, 512)         0         
 flatten (Flatten)           (None, 25088)             0         
 fc1 (Dense)                 (None, 4096)              102764544 
 fc2 (Dense)                 (None, 4096)              16781312  
 predictions (Dense)         (None, 1000)              4097000   
Total params: 138357544 (527.79 MB)
Trainable params: 138357544 (527.79 MB)
Non-trainable params: 0 (0.00 Byte)
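The parameter counts in the summary above can be checked by hand. The short sketch below reproduces them, assuming every convolution uses a 3x3 kernel with a bias term (which is consistent with the per-layer counts shown); the helper names are illustrative only.

```python
def conv_params(kernel, in_channels, out_channels):
    """Weights plus one bias per output channel of a square convolution."""
    return (kernel * kernel * in_channels + 1) * out_channels


def dense_params(in_features, out_features):
    """Weights plus one bias per output unit of a fully connected layer."""
    return (in_features + 1) * out_features


# (in_channels, out_channels) of the 13 convolutional layers, block by block
conv_layers = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]
# fc1, fc2, predictions; the flattened 7x7x512 feature map has 25088 values
dense_layers = [(7 * 7 * 512, 4096), (4096, 4096), (4096, 1000)]

conv_total = sum(conv_params(3, i, o) for i, o in conv_layers)
dense_total = sum(dense_params(i, o) for i, o in dense_layers)

print(conv_total)                # 14714688
print(dense_total)               # 123642856
print(conv_total + dense_total)  # 138357544
```

About 89% of the parameters sit in the three fully connected layers, which is one reason replacing that section with a lighter head is attractive.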

(2) Revise the original VGG16 architecture. As shown in the image below, we will modify only the fully connected (classifier) section. Because we are using pre-trained parameters, the parameters of the feature extraction portion must remain frozen (non-trainable).

In [ ]:
## your code here
Model: "vgg16"
 Layer (type)                Output Shape              Param #   
 input_1 (InputLayer)        [(None, 224, 224, 3)]     0         
 block1_conv1 (Conv2D)       (None, 224, 224, 64)      1792      
 block1_conv2 (Conv2D)       (None, 224, 224, 64)      36928     
 block1_pool (MaxPooling2D)  (None, 112, 112, 64)      0         
 block2_conv1 (Conv2D)       (None, 112, 112, 128)     73856     
 block2_conv2 (Conv2D)       (None, 112, 112, 128)     147584    
 block2_pool (MaxPooling2D)  (None, 56, 56, 128)       0         
 block3_conv1 (Conv2D)       (None, 56, 56, 256)       295168    
 block3_conv2 (Conv2D)       (None, 56, 56, 256)       590080    
 block3_conv3 (Conv2D)       (None, 56, 56, 256)       590080    
 block3_pool (MaxPooling2D)  (None, 28, 28, 256)       0         
 block4_conv1 (Conv2D)       (None, 28, 28, 512)       1180160   
 block4_conv2 (Conv2D)       (None, 28, 28, 512)       2359808   
 block4_conv3 (Conv2D)       (None, 28, 28, 512)       2359808   
 block4_pool (MaxPooling2D)  (None, 14, 14, 512)       0         
 block5_conv1 (Conv2D)       (None, 14, 14, 512)       2359808   
 block5_conv2 (Conv2D)       (None, 14, 14, 512)       2359808   
 block5_conv3 (Conv2D)       (None, 14, 14, 512)       2359808   
 block5_pool (MaxPooling2D)  (None, 7, 7, 512)         0         
 flatten (Flatten)           (None, 25088)             0         
 fc1 (Dense)                 (None, 4096)              102764544 
 fc2 (Dense)                 (None, 4096)              16781312  
 predictions (Dense)         (None, 1000)              4097000   
Total params: 138357544 (527.79 MB)
Trainable params: 0 (0.00 Byte)
Non-trainable params: 138357544 (527.79 MB)
In [ ]:
## your code here
Model: "model"
 Layer (type)                Output Shape              Param #   
 input_1 (InputLayer)        [(None, 224, 224, 3)]     0         
 block1_conv1 (Conv2D)       (None, 224, 224, 64)      1792      
 block1_conv2 (Conv2D)       (None, 224, 224, 64)      36928     
 block1_pool (MaxPooling2D)  (None, 112, 112, 64)      0         
 block2_conv1 (Conv2D)       (None, 112, 112, 128)     73856     
 block2_conv2 (Conv2D)       (None, 112, 112, 128)     147584    
 block2_pool (MaxPooling2D)  (None, 56, 56, 128)       0         
 block3_conv1 (Conv2D)       (None, 56, 56, 256)       295168    
 block3_conv2 (Conv2D)       (None, 56, 56, 256)       590080    
 block3_conv3 (Conv2D)       (None, 56, 56, 256)       590080    
 block3_pool (MaxPooling2D)  (None, 28, 28, 256)       0         
 block4_conv1 (Conv2D)       (None, 28, 28, 512)       1180160   
 block4_conv2 (Conv2D)       (None, 28, 28, 512)       2359808   
 block4_conv3 (Conv2D)       (None, 28, 28, 512)       2359808   
 block4_pool (MaxPooling2D)  (None, 14, 14, 512)       0         
 block5_conv1 (Conv2D)       (None, 14, 14, 512)       2359808   
 block5_conv2 (Conv2D)       (None, 14, 14, 512)       2359808   
 block5_conv3 (Conv2D)       (None, 14, 14, 512)       2359808   
 block5_pool (MaxPooling2D)  (None, 7, 7, 512)         0         
 conv2d (Conv2D)             (None, 7, 7, 1024)        4719616   
 global_average_pooling2d (  (None, 1024)              0         
 GlobalAveragePooling2D)                                         
 dense (Dense)               (None, 4)                 4100      
Total params: 19438404 (74.15 MB)
Trainable params: 4723716 (18.02 MB)
Non-trainable params: 14714688 (56.13 MB)
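A minimal Keras sketch that yields the same layer shapes and parameter counts as the summary above is given here. The 3x3 kernel and "same" padding follow from the 7x7x1024 output and the 4,719,616 parameters of `conv2d`; the ReLU and softmax activations are assumptions, as is loading the base with `include_top=False` (the outputs above suggest the full model was loaded first and then truncated at `block5_pool`). The function name `build_cam_model` is hypothetical.

```python
import tensorflow as tf


def build_cam_model(num_classes=4, weights="imagenet"):
    """Frozen VGG16 convolutional base + Conv2D + global average pooling + Dense."""
    # include_top=False drops flatten, fc1, fc2 and predictions
    base = tf.keras.applications.VGG16(
        weights=weights, include_top=False, input_shape=(224, 224, 3)
    )
    base.trainable = False  # keep the pre-trained feature extractor fixed

    # New trainable head (activations are assumptions, not given in the source)
    x = tf.keras.layers.Conv2D(1024, 3, padding="same", activation="relu")(base.output)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(inputs=base.input, outputs=outputs)
```

Calling `build_cam_model().summary()` should report 19,438,404 parameters in total, of which only the 4,723,716 in the new head are trainable. Ending with global average pooling followed by a single dense layer is also what makes the class activation maps in Problem 3 possible.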

(3) Train the modified VGG16 model.

In [ ]:
## your code here
Epoch 1/5
19/19 [==============================] - 47s 1s/step - loss: 1.7230 - accuracy: 0.5192
Epoch 2/5
19/19 [==============================] - 10s 515ms/step - loss: 0.4351 - accuracy: 0.8533
Epoch 3/5
19/19 [==============================] - 10s 530ms/step - loss: 0.3287 - accuracy: 0.8742
Epoch 4/5
19/19 [==============================] - 10s 542ms/step - loss: 0.2418 - accuracy: 0.9104
Epoch 5/5
19/19 [==============================] - 11s 551ms/step - loss: 0.2242 - accuracy: 0.9150
Out[ ]:
<keras.src.callbacks.History at 0x7d1270167550>
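A training call along these lines would produce a log of the shape shown above. The optimizer, loss, and batch size are not stated in the assignment, so Adam, sparse categorical cross-entropy (which presumes integer labels), and a batch size of 32 are assumptions; `compile_and_train` is a hypothetical helper.

```python
import tensorflow as tf


def compile_and_train(model, images, labels, epochs=5, batch_size=32):
    """Compile a classifier and fit it, returning the Keras History object."""
    model.compile(
        optimizer="adam",                        # assumed optimizer
        loss="sparse_categorical_crossentropy",  # assumes integer class labels
        metrics=["accuracy"],
    )
    return model.fit(images, labels, epochs=epochs, batch_size=batch_size)
```

With one-hot labels, `categorical_crossentropy` would be the matching loss. Images may also need `tf.keras.applications.vgg16.preprocess_input` applied before training so that they match what the pre-trained weights expect.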

(4) Print your accuracy with the test dataset.

In [ ]:
## your code here
Accuracy: 90.75 %
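Test accuracy is the fraction of test images whose highest-probability class matches the true label. The sketch below computes it from a matrix of predicted probabilities, assuming integer labels; the toy numbers are made up purely for illustration and are unrelated to the result above.

```python
import numpy as np


def top1_accuracy(probabilities, labels):
    """Fraction of rows whose arg-max class equals the integer label."""
    predicted = np.argmax(probabilities, axis=1)
    return float(np.mean(predicted == np.asarray(labels)))


# Toy example: 4 samples, 4 classes; the last sample is misclassified
toy_probs = np.array([
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.6, 0.2, 0.1],
    [0.2, 0.2, 0.5, 0.1],
    [0.4, 0.3, 0.2, 0.1],
])
toy_labels = np.array([0, 1, 2, 3])
print(f"Accuracy: {top1_accuracy(toy_probs, toy_labels) * 100:.2f} %")  # Accuracy: 75.00 %
```

In the notebook, `probabilities` would typically come from `model.predict(test_image)`; `model.evaluate(test_image, test_label)` reports the same metric when the model was compiled with `metrics=["accuracy"]`.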

Problem 3: Class Activation Maps

(1) Visualize the Class Activation Mapping (CAM) results as presented in the provided figure.

In [ ]:
## your code here
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 132ms/step
1/1 [==============================] - 0s 22ms/step
1/1 [==============================] - 0s 22ms/step
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 0s 23ms/step
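For a network that ends in global average pooling followed by one dense layer, the class activation map for class c is the weighted sum of the last convolutional feature maps, using the dense-layer weights for that class. The NumPy sketch below shows only this core step, assuming `features` has shape (H, W, K) and `class_weights` has shape (K, C); the min-max normalization is a display convenience, not part of the CAM definition.

```python
import numpy as np


def class_activation_map(features, class_weights, class_index):
    """Weighted sum over channels of the last conv feature maps.

    features:      (H, W, K) activations of the last convolutional layer
    class_weights: (K, C) kernel of the final dense layer
    Returns an (H, W) map rescaled to [0, 1] for display.
    """
    cam = features @ class_weights[:, class_index]  # shape (H, W)
    cam = cam - cam.min()
    peak = cam.max()
    return cam / peak if peak > 0 else cam
```

With the model summarized in Problem 2, the features could be taken from the output of the layer named `conv2d` (a 7x7x1024 map) and the weights from `model.get_layer("dense").get_weights()[0]`. The resulting 7x7 map can then be upsampled to 224x224, for example with `cv2.resize`, and overlaid on the input image.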