Pre-trained CNNs and Transfer Learning


By Prof. Seungchul Lee
http://iailab.kaist.ac.kr/
Industrial AI Lab at KAIST

Table of Contents


1. Pre-trained Models
  1.1. ImageNet
  1.2. Pre-trained CNN Models
  1.3. Load Pre-trained Models
2. Transfer Learning
  2.1. Pre-trained Model (VGG16)
  2.2. Transfer Learning

In [ ]:
from IPython.display import YouTubeVideo
YouTubeVideo('7JcSo0jCLdE?si=d530KtZ2bu7pNTxe&start=23', width = "560", height = "315")
Out[ ]:

1.1. ImageNet

  • Human top-5 error on the ImageNet classification task ≈ 5.1%



1.2. Pre-trained CNN Models

LeNet

  • CNN = Convolutional Neural Networks = ConvNet
  • LeCun, Y., Bottou, L., Bengio, Y., and Haffner, P. (1998). Gradient-based learning applied to document recognition.
  • All of its components (convolution, pooling, fully connected layers) are still the basic building blocks of modern ConvNets! (A minimal Keras sketch follows below.)
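A minimal sketch of a LeNet-5-style network in Keras. The layer sizes follow the 1998 paper, but the use of ReLU (instead of the original tanh/sigmoid) and softmax is an illustrative assumption:

In [ ]:
import tensorflow as tf

# LeNet-5-style stack: conv -> pool -> conv -> pool -> fully connected layers
lenet = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape = (32, 32, 1)),
    tf.keras.layers.Conv2D(filters = 6, kernel_size = (5, 5), activation = 'relu'),
    tf.keras.layers.AveragePooling2D(pool_size = (2, 2)),
    tf.keras.layers.Conv2D(filters = 16, kernel_size = (5, 5), activation = 'relu'),
    tf.keras.layers.AveragePooling2D(pool_size = (2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(units = 120, activation = 'relu'),
    tf.keras.layers.Dense(units = 84, activation = 'relu'),
    tf.keras.layers.Dense(units = 10, activation = 'softmax')
])

lenet.summary()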





AlexNet

  • Simplified version of Krizhevsky, Sutskever, and Hinton. "ImageNet classification with deep convolutional neural networks." NIPS 2012

  • LeNet-style backbone, plus (see the sketch after this list):

    • ReLU [Nair & Hinton 2010]
      • RevoLUtion of deep learning
      • Accelerate training
    • Dropout [Hinton et al 2012]
      • In-network ensembling
      • Reduce overfitting
    • Data augmentation
      • Label-preserving transformation
      • Reduce overfitting
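A minimal sketch of these three ingredients (ReLU, dropout, and on-the-fly data augmentation) in Keras. The layer sizes are illustrative assumptions, not the actual AlexNet configuration:

In [ ]:
import tensorflow as tf

# data augmentation: label-preserving transformations applied during training
augment = tf.keras.models.Sequential([
    tf.keras.layers.RandomFlip('horizontal'),
    tf.keras.layers.RandomTranslation(height_factor = 0.1, width_factor = 0.1)
])

alexnet_style = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape = (224, 224, 3)),
    augment,
    tf.keras.layers.Conv2D(filters = 64, kernel_size = (11, 11), strides = 4,
                           activation = 'relu'),           # ReLU non-linearity
    tf.keras.layers.MaxPool2D(pool_size = (3, 3), strides = 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(units = 512, activation = 'relu'),
    tf.keras.layers.Dropout(rate = 0.5),                    # in-network ensembling
    tf.keras.layers.Dense(units = 1000, activation = 'softmax')
])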





VGG-16/19

  • Simonyan and Zisserman. "Very deep convolutional networks for large-scale image recognition." (2014)

  • Simply “Very Deep”!

    • Modularized design (see the sketch after this list)
      • 3x3 Conv as the module
      • Stack the same module
      • Same computation for each module
    • Stage-wise training
      • VGG-11 → VGG-13 → VGG-16
      • We need a better initialization…
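A minimal sketch of the modular design: identical 3x3 conv blocks stacked stage by stage. The number of convolutions and filters per stage below is an illustrative assumption:

In [ ]:
import tensorflow as tf

def vgg_block(x, n_convs, n_filters):
    # a VGG module: n_convs identical 3x3 convolutions followed by 2x2 max-pooling
    for _ in range(n_convs):
        x = tf.keras.layers.Conv2D(filters = n_filters,
                                   kernel_size = (3, 3),
                                   padding = 'SAME',
                                   activation = 'relu')(x)
    return tf.keras.layers.MaxPool2D(pool_size = (2, 2), strides = 2)(x)

inputs = tf.keras.layers.Input(shape = (224, 224, 3))
x = vgg_block(inputs, n_convs = 2, n_filters = 64)     # stage 1
x = vgg_block(x, n_convs = 2, n_filters = 128)         # stage 2
x = vgg_block(x, n_convs = 3, n_filters = 256)         # stage 3: same module, stacked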





GoogLeNet/Inception

  • Multiple branches
    • e.g., 1x1, 3x3, 5x5, pool
  • Shortcuts
    • stand-alone 1x1, merged by concat.
  • Bottleneck
    • Reduce dimensionality with a 1x1 conv before the expensive 3x3/5x5 convs (see the sketch after this list)
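A minimal sketch of an Inception-style module with parallel branches, 1x1 bottlenecks, and channel-wise concatenation. The filter counts are illustrative assumptions:

In [ ]:
import tensorflow as tf

def inception_module(x):
    # branch 1: stand-alone 1x1 conv
    b1 = tf.keras.layers.Conv2D(64, (1, 1), padding = 'SAME', activation = 'relu')(x)

    # branch 2: 1x1 bottleneck to reduce channels before the expensive 3x3 conv
    b2 = tf.keras.layers.Conv2D(48, (1, 1), padding = 'SAME', activation = 'relu')(x)
    b2 = tf.keras.layers.Conv2D(64, (3, 3), padding = 'SAME', activation = 'relu')(b2)

    # branch 3: 1x1 bottleneck before the 5x5 conv
    b3 = tf.keras.layers.Conv2D(16, (1, 1), padding = 'SAME', activation = 'relu')(x)
    b3 = tf.keras.layers.Conv2D(32, (5, 5), padding = 'SAME', activation = 'relu')(b3)

    # branch 4: pooling followed by a 1x1 conv
    b4 = tf.keras.layers.MaxPool2D((3, 3), strides = 1, padding = 'SAME')(x)
    b4 = tf.keras.layers.Conv2D(32, (1, 1), padding = 'SAME', activation = 'relu')(b4)

    # merge all branches by concatenating along the channel axis
    return tf.keras.layers.Concatenate()([b1, b2, b3, b4])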





ResNet

  • He, Kaiming, et al. "Deep residual learning for image recognition." CVPR. 2016.





  • Skip Connection and Residual Net

    • A direct connection between two non-consecutive layers

    • Mitigates the vanishing gradient problem

    • Parameters are optimized to learn a residual, i.e., the difference between the block's input and the output needed after the block.

    • A skip connection is a connection that bypasses at least one layer.

    • In encoder-decoder networks, it is often used to transfer local information by concatenating or summing feature maps from the downsampling path with feature maps from the upsampling path.

    • Merging features from various resolution levels helps combine context information with spatial information.

In [ ]:
import tensorflow as tf

n_hidden = 100   # number of hidden units (illustrative value)
n_output = 10    # number of output classes (illustrative value)

def residual_net(x):
    conv1 = tf.keras.layers.Conv2D(filters = 32,
                                   kernel_size = (3, 3),
                                   padding = "SAME",
                                   activation = 'relu')(x)

    conv2 = tf.keras.layers.Conv2D(filters = 32,
                                   kernel_size = (3, 3),
                                   padding = "SAME",
                                   activation = 'relu')(conv1)

    # skip connection: add the block input x to the block output conv2
    # ("SAME" padding keeps the spatial size, so the element-wise sum is valid
    #  as long as x also has 32 channels)
    maxp2 = tf.keras.layers.MaxPool2D(pool_size = (2, 2),
                                      strides = 2)(conv2 + x)

    flat = tf.keras.layers.Flatten()(maxp2)

    hidden = tf.keras.layers.Dense(units = n_hidden,
                                   activation = 'relu')(flat)

    output = tf.keras.layers.Dense(units = n_output)(hidden)

    return output
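A usage sketch: because of the element-wise sum conv2 + x, the block input must also have 32 channels; the input shape below is an illustrative assumption.

In [ ]:
inputs = tf.keras.layers.Input(shape = (28, 28, 32))
res_model = tf.keras.Model(inputs = inputs, outputs = residual_net(inputs))
res_model.summary()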

DenseNets

  • Huang, Gao, et al. "Densely connected convolutional networks." CVPR 2017.
  • Within a dense block, each layer receives the concatenated feature maps of all preceding layers (a sketch of a dense block follows).

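A minimal sketch of a dense block, where each layer's output is concatenated with all previous feature maps. The growth rate and number of layers are illustrative assumptions:

In [ ]:
import tensorflow as tf

def dense_block(x, n_layers = 4, growth_rate = 12):
    # each layer sees the concatenation of all preceding feature maps
    for _ in range(n_layers):
        new_features = tf.keras.layers.Conv2D(filters = growth_rate,
                                              kernel_size = (3, 3),
                                              padding = 'SAME',
                                              activation = 'relu')(x)
        x = tf.keras.layers.Concatenate()([x, new_features])
    return x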
U-Net





  • The U-Net owes its name to its symmetric, U-shaped architecture.

  • The U-Net architecture is built upon the Fully Convolutional Network (FCN) and modified so that it yields better segmentation on medical images.

  • Compared to FCN-8, the two main differences are

    • U-Net is symmetric, and
    • the skip connections between the downsampling path and the upsampling path apply a concatenation operator instead of a sum.
  • These skip connections are intended to provide local information to the global information while upsampling. Because of its symmetry, the network has a large number of feature maps in the upsampling path, which allows information to be transferred. (A sketch of one down/up stage with a concatenation skip follows.)
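A minimal sketch of one down/up stage with a concatenation skip connection, as used in U-Net. The filter counts and input size are illustrative assumptions:

In [ ]:
import tensorflow as tf

inputs = tf.keras.layers.Input(shape = (128, 128, 1))

# downsampling path
d1 = tf.keras.layers.Conv2D(64, (3, 3), padding = 'SAME', activation = 'relu')(inputs)
p1 = tf.keras.layers.MaxPool2D((2, 2))(d1)

# bottom of the "U"
b = tf.keras.layers.Conv2D(128, (3, 3), padding = 'SAME', activation = 'relu')(p1)

# upsampling path: transposed conv, then concatenate with the matching
# downsampling feature maps (instead of summing them, as in FCN)
u1 = tf.keras.layers.Conv2DTranspose(64, (2, 2), strides = 2, padding = 'SAME')(b)
u1 = tf.keras.layers.Concatenate()([u1, d1])
u1 = tf.keras.layers.Conv2D(64, (3, 3), padding = 'SAME', activation = 'relu')(u1)

# 1x1 conv produces the per-pixel segmentation map
outputs = tf.keras.layers.Conv2D(1, (1, 1), activation = 'sigmoid')(u1)
unet_stage = tf.keras.Model(inputs = inputs, outputs = outputs)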

1.3. Load Pre-trained Models

List of Available Models

  • VGG16
  • VGG19
  • ResNet
  • GoogLeNet/Inception
  • DenseNet
  • MobileNet
In [ ]:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import cv2

%matplotlib inline

Model Selection

In [ ]:
# model_type = tf.keras.applications.densenet
# model_type = tf.keras.applications.inception_resnet_v2
# model_type = tf.keras.applications.inception_v3
model_type = tf.keras.applications.mobilenet
# model_type = tf.keras.applications.mobilenet_v2
# model_type = tf.keras.applications.nasnet
# model_type = tf.keras.applications.resnet50
# model_type = tf.keras.applications.vgg16
# model_type = tf.keras.applications.vgg19

Model Summary

In [ ]:
model = model_type.MobileNet() # change the model here (hint: use the matching class name, e.g., VGG16, ResNet50)

model.summary()
Model: "mobilenet_1.00_224"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 224, 224, 3)]     0         
                                                                 
 conv1 (Conv2D)              (None, 112, 112, 32)      864       
                                                                 
 conv1_bn (BatchNormalizati  (None, 112, 112, 32)      128       
 on)                                                             
                                                                 
 conv1_relu (ReLU)           (None, 112, 112, 32)      0         
                                                                 
 conv_dw_1 (DepthwiseConv2D  (None, 112, 112, 32)      288       
 )                                                               
                                                                 
 conv_dw_1_bn (BatchNormali  (None, 112, 112, 32)      128       
 zation)                                                         
                                                                 
 conv_dw_1_relu (ReLU)       (None, 112, 112, 32)      0         
                                                                 
 conv_pw_1 (Conv2D)          (None, 112, 112, 64)      2048      
                                                                 
 conv_pw_1_bn (BatchNormali  (None, 112, 112, 64)      256       
 zation)                                                         
                                                                 
 conv_pw_1_relu (ReLU)       (None, 112, 112, 64)      0         
                                                                 
 conv_pad_2 (ZeroPadding2D)  (None, 113, 113, 64)      0         
                                                                 
 conv_dw_2 (DepthwiseConv2D  (None, 56, 56, 64)        576       
 )                                                               
                                                                 
 conv_dw_2_bn (BatchNormali  (None, 56, 56, 64)        256       
 zation)                                                         
                                                                 
 conv_dw_2_relu (ReLU)       (None, 56, 56, 64)        0         
                                                                 
 conv_pw_2 (Conv2D)          (None, 56, 56, 128)       8192      
                                                                 
 conv_pw_2_bn (BatchNormali  (None, 56, 56, 128)       512       
 zation)                                                         
                                                                 
 conv_pw_2_relu (ReLU)       (None, 56, 56, 128)       0         
                                                                 
 conv_dw_3 (DepthwiseConv2D  (None, 56, 56, 128)       1152      
 )                                                               
                                                                 
 conv_dw_3_bn (BatchNormali  (None, 56, 56, 128)       512       
 zation)                                                         
                                                                 
 conv_dw_3_relu (ReLU)       (None, 56, 56, 128)       0         
                                                                 
 conv_pw_3 (Conv2D)          (None, 56, 56, 128)       16384     
                                                                 
 conv_pw_3_bn (BatchNormali  (None, 56, 56, 128)       512       
 zation)                                                         
                                                                 
 conv_pw_3_relu (ReLU)       (None, 56, 56, 128)       0         
                                                                 
 conv_pad_4 (ZeroPadding2D)  (None, 57, 57, 128)       0         
                                                                 
 conv_dw_4 (DepthwiseConv2D  (None, 28, 28, 128)       1152      
 )                                                               
                                                                 
 conv_dw_4_bn (BatchNormali  (None, 28, 28, 128)       512       
 zation)                                                         
                                                                 
 conv_dw_4_relu (ReLU)       (None, 28, 28, 128)       0         
                                                                 
 conv_pw_4 (Conv2D)          (None, 28, 28, 256)       32768     
                                                                 
 conv_pw_4_bn (BatchNormali  (None, 28, 28, 256)       1024      
 zation)                                                         
                                                                 
 conv_pw_4_relu (ReLU)       (None, 28, 28, 256)       0         
                                                                 
 conv_dw_5 (DepthwiseConv2D  (None, 28, 28, 256)       2304      
 )                                                               
                                                                 
 conv_dw_5_bn (BatchNormali  (None, 28, 28, 256)       1024      
 zation)                                                         
                                                                 
 conv_dw_5_relu (ReLU)       (None, 28, 28, 256)       0         
                                                                 
 conv_pw_5 (Conv2D)          (None, 28, 28, 256)       65536     
                                                                 
 conv_pw_5_bn (BatchNormali  (None, 28, 28, 256)       1024      
 zation)                                                         
                                                                 
 conv_pw_5_relu (ReLU)       (None, 28, 28, 256)       0         
                                                                 
 conv_pad_6 (ZeroPadding2D)  (None, 29, 29, 256)       0         
                                                                 
 conv_dw_6 (DepthwiseConv2D  (None, 14, 14, 256)       2304      
 )                                                               
                                                                 
 conv_dw_6_bn (BatchNormali  (None, 14, 14, 256)       1024      
 zation)                                                         
                                                                 
 conv_dw_6_relu (ReLU)       (None, 14, 14, 256)       0         
                                                                 
 conv_pw_6 (Conv2D)          (None, 14, 14, 512)       131072    
                                                                 
 conv_pw_6_bn (BatchNormali  (None, 14, 14, 512)       2048      
 zation)                                                         
                                                                 
 conv_pw_6_relu (ReLU)       (None, 14, 14, 512)       0         
                                                                 
 conv_dw_7 (DepthwiseConv2D  (None, 14, 14, 512)       4608      
 )                                                               
                                                                 
 conv_dw_7_bn (BatchNormali  (None, 14, 14, 512)       2048      
 zation)                                                         
                                                                 
 conv_dw_7_relu (ReLU)       (None, 14, 14, 512)       0         
                                                                 
 conv_pw_7 (Conv2D)          (None, 14, 14, 512)       262144    
                                                                 
 conv_pw_7_bn (BatchNormali  (None, 14, 14, 512)       2048      
 zation)                                                         
                                                                 
 conv_pw_7_relu (ReLU)       (None, 14, 14, 512)       0         
                                                                 
 conv_dw_8 (DepthwiseConv2D  (None, 14, 14, 512)       4608      
 )                                                               
                                                                 
 conv_dw_8_bn (BatchNormali  (None, 14, 14, 512)       2048      
 zation)                                                         
                                                                 
 conv_dw_8_relu (ReLU)       (None, 14, 14, 512)       0         
                                                                 
 conv_pw_8 (Conv2D)          (None, 14, 14, 512)       262144    
                                                                 
 conv_pw_8_bn (BatchNormali  (None, 14, 14, 512)       2048      
 zation)                                                         
                                                                 
 conv_pw_8_relu (ReLU)       (None, 14, 14, 512)       0         
                                                                 
 conv_dw_9 (DepthwiseConv2D  (None, 14, 14, 512)       4608      
 )                                                               
                                                                 
 conv_dw_9_bn (BatchNormali  (None, 14, 14, 512)       2048      
 zation)                                                         
                                                                 
 conv_dw_9_relu (ReLU)       (None, 14, 14, 512)       0         
                                                                 
 conv_pw_9 (Conv2D)          (None, 14, 14, 512)       262144    
                                                                 
 conv_pw_9_bn (BatchNormali  (None, 14, 14, 512)       2048      
 zation)                                                         
                                                                 
 conv_pw_9_relu (ReLU)       (None, 14, 14, 512)       0         
                                                                 
 conv_dw_10 (DepthwiseConv2  (None, 14, 14, 512)       4608      
 D)                                                              
                                                                 
 conv_dw_10_bn (BatchNormal  (None, 14, 14, 512)       2048      
 ization)                                                        
                                                                 
 conv_dw_10_relu (ReLU)      (None, 14, 14, 512)       0         
                                                                 
 conv_pw_10 (Conv2D)         (None, 14, 14, 512)       262144    
                                                                 
 conv_pw_10_bn (BatchNormal  (None, 14, 14, 512)       2048      
 ization)                                                        
                                                                 
 conv_pw_10_relu (ReLU)      (None, 14, 14, 512)       0         
                                                                 
 conv_dw_11 (DepthwiseConv2  (None, 14, 14, 512)       4608      
 D)                                                              
                                                                 
 conv_dw_11_bn (BatchNormal  (None, 14, 14, 512)       2048      
 ization)                                                        
                                                                 
 conv_dw_11_relu (ReLU)      (None, 14, 14, 512)       0         
                                                                 
 conv_pw_11 (Conv2D)         (None, 14, 14, 512)       262144    
                                                                 
 conv_pw_11_bn (BatchNormal  (None, 14, 14, 512)       2048      
 ization)                                                        
                                                                 
 conv_pw_11_relu (ReLU)      (None, 14, 14, 512)       0         
                                                                 
 conv_pad_12 (ZeroPadding2D  (None, 15, 15, 512)       0         
 )                                                               
                                                                 
 conv_dw_12 (DepthwiseConv2  (None, 7, 7, 512)         4608      
 D)                                                              
                                                                 
 conv_dw_12_bn (BatchNormal  (None, 7, 7, 512)         2048      
 ization)                                                        
                                                                 
 conv_dw_12_relu (ReLU)      (None, 7, 7, 512)         0         
                                                                 
 conv_pw_12 (Conv2D)         (None, 7, 7, 1024)        524288    
                                                                 
 conv_pw_12_bn (BatchNormal  (None, 7, 7, 1024)        4096      
 ization)                                                        
                                                                 
 conv_pw_12_relu (ReLU)      (None, 7, 7, 1024)        0         
                                                                 
 conv_dw_13 (DepthwiseConv2  (None, 7, 7, 1024)        9216      
 D)                                                              
                                                                 
 conv_dw_13_bn (BatchNormal  (None, 7, 7, 1024)        4096      
 ization)                                                        
                                                                 
 conv_dw_13_relu (ReLU)      (None, 7, 7, 1024)        0         
                                                                 
 conv_pw_13 (Conv2D)         (None, 7, 7, 1024)        1048576   
                                                                 
 conv_pw_13_bn (BatchNormal  (None, 7, 7, 1024)        4096      
 ization)                                                        
                                                                 
 conv_pw_13_relu (ReLU)      (None, 7, 7, 1024)        0         
                                                                 
 global_average_pooling2d (  (None, 1, 1, 1024)        0         
 GlobalAveragePooling2D)                                         
                                                                 
 dropout (Dropout)           (None, 1, 1, 1024)        0         
                                                                 
 conv_preds (Conv2D)         (None, 1, 1, 1000)        1025000   
                                                                 
 reshape_2 (Reshape)         (None, 1000)              0         
                                                                 
 predictions (Activation)    (None, 1000)              0         
                                                                 
=================================================================
Total params: 4253864 (16.23 MB)
Trainable params: 4231976 (16.14 MB)
Non-trainable params: 21888 (85.50 KB)
_________________________________________________________________

Example of Pre-trained Model

In [ ]:
from google.colab import drive
drive.mount('/content/drive')
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
In [ ]:
# img = cv2.imread('/content/drive/MyDrive/DL/DL_data/ILSVRC2017_test_00000005.JPEG')
img = cv2.imread('/content/drive/MyDrive/DL/DL_data/ILSVRC2017_test_00005381.JPEG')

# note: cv2.imread returns channels in BGR order; use
# cv2.cvtColor(img, cv2.COLOR_BGR2RGB) if correct colors are needed for
# plt.imshow and for the RGB-trained ImageNet models
print(img.shape)

plt.figure(figsize = (6, 6))
plt.imshow(img)
plt.axis('off')
plt.show()
(500, 333, 3)
In [ ]:
resized_img = cv2.resize(img, (224, 224)).reshape(1, 224, 224, 3)

plt.figure(figsize = (6, 6))
plt.imshow(resized_img[0])
plt.axis('off')
plt.show()
In [ ]:
input_img = model_type.preprocess_input(resized_img)

pred = model.predict(input_img, verbose = 0)
label = model_type.decode_predictions(pred)[0]

print('%s (%.2f%%)\n' % (label[0][1], label[0][2]*100))
print('%s (%.2f%%)\n' % (label[1][1], label[1][2]*100))
print('%s (%.2f%%)\n' % (label[2][1], label[2][2]*100))
print('%s (%.2f%%)\n' % (label[3][1], label[3][2]*100))
print('%s (%.2f%%)\n' % (label[4][1], label[4][2]*100))
soccer_ball (92.07%)

knee_pad (2.68%)

football_helmet (2.44%)

ballplayer (1.17%)

tennis_ball (0.49%)

2. Transfer Learning

In [ ]:
from IPython.display import YouTubeVideo
YouTubeVideo('7JcSo0jCLdE?si=7IuLwj5L5lxk6lxI&start=2003', width = "560", height = "315")
Out[ ]:

2.1. Pre-trained Model (VGG16)

  • Training a model on ImageNet from scratch takes days or weeks.

  • Many models trained on ImageNet and their weights are publicly available!

  • Transfer learning

    • Use the pre-trained weights and remove the last layers to compute representations of images
    • The network is used as a generic feature extractor (see the sketch after this list)
    • Train a classification model from these features on a new classification task
    • Pre-trained models extract general image features that help identify edges, textures, shapes, and object composition
    • Better than handcrafted feature extraction on natural images
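A minimal sketch of using a pre-trained network as a generic feature extractor: VGG16 with its classification head removed, producing one 512-dimensional feature vector per image. This is an illustrative assumption-level sketch (the dummy batch is made up), slightly different from the route followed below:

In [ ]:
import numpy as np
import tensorflow as tf

# load VGG16 without its fully connected head; pooling = 'avg' returns one
# 512-dimensional feature vector per image
feature_extractor = tf.keras.applications.vgg16.VGG16(include_top = False,
                                                      input_shape = (224, 224, 3),
                                                      pooling = 'avg')
feature_extractor.trainable = False

# example: compute features for a batch of (dummy) images
dummy_batch = np.random.rand(2, 224, 224, 3).astype(np.float32)
features = feature_extractor.predict(tf.keras.applications.vgg16.preprocess_input(dummy_batch),
                                     verbose = 0)
print(features.shape)   # (2, 512)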






Import Library

In [ ]:
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

Load Data

In [ ]:
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive
In [ ]:
# Change file paths if necessary

train_imgs = np.load('/content/drive/MyDrive/DL/DL_data/tranfer_learning_train_images.npy')
train_labels = np.load('/content/drive/MyDrive/DL/DL_data/tranfer_learning_train_labels.npy')

test_imgs = np.load('/content/drive/MyDrive/DL/DL_data/tranfer_learning_test_images.npy')
test_labels = np.load('/content/drive/MyDrive/DL/DL_data/tranfer_learning_test_labels.npy')

print(train_imgs.shape)
print(train_labels[0]) # one-hot-encoded 5 classes

# remove one-hot-encoding
train_labels = np.argmax(train_labels, axis = 1)
test_labels = np.argmax(test_labels, axis = 1)
(65, 224, 224, 3)
[1. 0. 0. 0. 0.]
In [ ]:
n_train = train_imgs.shape[0]
n_test = test_imgs.shape[0]

# very small dataset
print(n_train)
print(n_test)
65
9
In [ ]:
Dict = ['Hat', 'Cube', 'Card', 'Torch', 'Screw']

# show a few training images with their labels
plt.figure(figsize = (8, 6))
for i, idx in enumerate([1, 2, 3, 18, 25]):
    plt.subplot(2, 3, i + 1)
    plt.imshow(train_imgs[idx])
    plt.title("Label: {}".format(Dict[train_labels[idx]]))
    plt.axis('off')
plt.show()

Load VGG16 Model





In [ ]:
model_type = tf.keras.applications.vgg16
base_model = model_type.VGG16()
base_model.trainable = False
base_model.summary()
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels.h5
553467096/553467096 [==============================] - 27s 0us/step
Model: "vgg16"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 224, 224, 3)]     0         
                                                                 
 block1_conv1 (Conv2D)       (None, 224, 224, 64)      1792      
                                                                 
 block1_conv2 (Conv2D)       (None, 224, 224, 64)      36928     
                                                                 
 block1_pool (MaxPooling2D)  (None, 112, 112, 64)      0         
                                                                 
 block2_conv1 (Conv2D)       (None, 112, 112, 128)     73856     
                                                                 
 block2_conv2 (Conv2D)       (None, 112, 112, 128)     147584    
                                                                 
 block2_pool (MaxPooling2D)  (None, 56, 56, 128)       0         
                                                                 
 block3_conv1 (Conv2D)       (None, 56, 56, 256)       295168    
                                                                 
 block3_conv2 (Conv2D)       (None, 56, 56, 256)       590080    
                                                                 
 block3_conv3 (Conv2D)       (None, 56, 56, 256)       590080    
                                                                 
 block3_pool (MaxPooling2D)  (None, 28, 28, 256)       0         
                                                                 
 block4_conv1 (Conv2D)       (None, 28, 28, 512)       1180160   
                                                                 
 block4_conv2 (Conv2D)       (None, 28, 28, 512)       2359808   
                                                                 
 block4_conv3 (Conv2D)       (None, 28, 28, 512)       2359808   
                                                                 
 block4_pool (MaxPooling2D)  (None, 14, 14, 512)       0         
                                                                 
 block5_conv1 (Conv2D)       (None, 14, 14, 512)       2359808   
                                                                 
 block5_conv2 (Conv2D)       (None, 14, 14, 512)       2359808   
                                                                 
 block5_conv3 (Conv2D)       (None, 14, 14, 512)       2359808   
                                                                 
 block5_pool (MaxPooling2D)  (None, 7, 7, 512)         0         
                                                                 
 flatten (Flatten)           (None, 25088)             0         
                                                                 
 fc1 (Dense)                 (None, 4096)              102764544 
                                                                 
 fc2 (Dense)                 (None, 4096)              16781312  
                                                                 
 predictions (Dense)         (None, 1000)              4097000   
                                                                 
=================================================================
Total params: 138357544 (527.79 MB)
Trainable params: 0 (0.00 Byte)
Non-trainable params: 138357544 (527.79 MB)
_________________________________________________________________

Testing for Target Data

In [ ]:
idx = np.random.randint(n_test)
pred = base_model.predict(test_imgs[idx].reshape(-1, 224, 224, 3), verbose = 0)
label = model_type.decode_predictions(pred)[0]

print('%s (%.2f%%)' % (label[0][1], label[0][2]*100))
print('%s (%.2f%%)' % (label[1][1], label[1][2]*100))
print('%s (%.2f%%)' % (label[2][1], label[2][2]*100))
print('%s (%.2f%%)' % (label[3][1], label[3][2]*100))
print('%s (%.2f%%)' % (label[4][1], label[4][2]*100))

plt.figure(figsize = (4, 4))
plt.imshow(test_imgs[idx])
plt.title("Label : {}".format(Dict[test_labels[idx]]))
plt.axis('off')
plt.show()
Downloading data from https://storage.googleapis.com/download.tensorflow.org/data/imagenet_class_index.json
35363/35363 [==============================] - 0s 0us/step
mosquito_net (6.94%)
toilet_tissue (3.43%)
Band_Aid (1.53%)
envelope (1.46%)
shower_curtain (1.39%)

2.2. Transfer Learning

  • We assume that the pre-trained parameters contain the knowledge learned from the source dataset and that this knowledge is equally applicable to the target dataset.
  • One option is to train the new output layer from scratch while the remaining layers keep the parameters of the source model (frozen, as done below).
  • Alternatively, initialize all weights from the pre-trained model and then fine-tune them with the target data (a sketch of this option follows).
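A minimal sketch of the second option (fine-tuning all weights) for the same 5-class target task used below; the small learning rate is an illustrative choice to avoid destroying the pre-trained weights:

In [ ]:
import tensorflow as tf

# VGG16 backbone without its ImageNet head, followed by a new 5-class classifier
base = tf.keras.applications.vgg16.VGG16(include_top = False,
                                         input_shape = (224, 224, 3),
                                         pooling = 'avg')
base.trainable = True                      # all pre-trained weights are updated

finetune_model = tf.keras.models.Sequential([
    base,
    tf.keras.layers.Dense(units = 5, activation = 'softmax')
])

# a small learning rate keeps the updates close to the pre-trained weights
finetune_model.compile(optimizer = tf.keras.optimizers.Adam(learning_rate = 1e-5),
                       loss = 'sparse_categorical_crossentropy',
                       metrics = ['accuracy'])

# finetune_model.fit(train_imgs, train_labels, batch_size = 10, epochs = 10)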









Pre-trained Weights, Biases

In [ ]:
vgg16_weights = base_model.get_weights()

Build a Transfer Learning Model

In [ ]:
# replace new and trainable classifier layer
fc2_layer = base_model.layers[-2].output
output = tf.keras.layers.Dense(units = 5, activation = 'softmax')(fc2_layer)

# define new model
TL_model = tf.keras.Model(inputs = base_model.inputs, outputs = output)
In [ ]:
TL_model.summary()
Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 224, 224, 3)]     0         
                                                                 
 block1_conv1 (Conv2D)       (None, 224, 224, 64)      1792      
                                                                 
 block1_conv2 (Conv2D)       (None, 224, 224, 64)      36928     
                                                                 
 block1_pool (MaxPooling2D)  (None, 112, 112, 64)      0         
                                                                 
 block2_conv1 (Conv2D)       (None, 112, 112, 128)     73856     
                                                                 
 block2_conv2 (Conv2D)       (None, 112, 112, 128)     147584    
                                                                 
 block2_pool (MaxPooling2D)  (None, 56, 56, 128)       0         
                                                                 
 block3_conv1 (Conv2D)       (None, 56, 56, 256)       295168    
                                                                 
 block3_conv2 (Conv2D)       (None, 56, 56, 256)       590080    
                                                                 
 block3_conv3 (Conv2D)       (None, 56, 56, 256)       590080    
                                                                 
 block3_pool (MaxPooling2D)  (None, 28, 28, 256)       0         
                                                                 
 block4_conv1 (Conv2D)       (None, 28, 28, 512)       1180160   
                                                                 
 block4_conv2 (Conv2D)       (None, 28, 28, 512)       2359808   
                                                                 
 block4_conv3 (Conv2D)       (None, 28, 28, 512)       2359808   
                                                                 
 block4_pool (MaxPooling2D)  (None, 14, 14, 512)       0         
                                                                 
 block5_conv1 (Conv2D)       (None, 14, 14, 512)       2359808   
                                                                 
 block5_conv2 (Conv2D)       (None, 14, 14, 512)       2359808   
                                                                 
 block5_conv3 (Conv2D)       (None, 14, 14, 512)       2359808   
                                                                 
 block5_pool (MaxPooling2D)  (None, 7, 7, 512)         0         
                                                                 
 flatten (Flatten)           (None, 25088)             0         
                                                                 
 fc1 (Dense)                 (None, 4096)              102764544 
                                                                 
 fc2 (Dense)                 (None, 4096)              16781312  
                                                                 
 dense (Dense)               (None, 5)                 20485     
                                                                 
=================================================================
Total params: 134281029 (512.24 MB)
Trainable params: 20485 (80.02 KB)
Non-trainable params: 134260544 (512.16 MB)
_________________________________________________________________

Define Loss and Optimizer

In [ ]:
TL_model.compile(optimizer = 'adam',
                 loss = 'sparse_categorical_crossentropy',
                 metrics = ['accuracy'])

Optimize

In [ ]:
TL_model.fit(train_imgs, train_labels, batch_size = 10, epochs = 10)
Epoch 1/10
7/7 [==============================] - 5s 233ms/step - loss: 1.8993 - accuracy: 0.2462
Epoch 2/10
7/7 [==============================] - 0s 53ms/step - loss: 1.6406 - accuracy: 0.4462
Epoch 3/10
7/7 [==============================] - 0s 54ms/step - loss: 1.2547 - accuracy: 0.4462
Epoch 4/10
7/7 [==============================] - 0s 56ms/step - loss: 0.9955 - accuracy: 0.6000
Epoch 5/10
7/7 [==============================] - 0s 56ms/step - loss: 0.7194 - accuracy: 0.8615
Epoch 6/10
7/7 [==============================] - 0s 55ms/step - loss: 0.6324 - accuracy: 0.8769
Epoch 7/10
7/7 [==============================] - 0s 54ms/step - loss: 0.6120 - accuracy: 0.8462
Epoch 8/10
7/7 [==============================] - 0s 55ms/step - loss: 0.4729 - accuracy: 0.9692
Epoch 9/10
7/7 [==============================] - 0s 54ms/step - loss: 0.4166 - accuracy: 0.9538
Epoch 10/10
7/7 [==============================] - 0s 53ms/step - loss: 0.3716 - accuracy: 0.9846
Out[ ]:
<keras.src.callbacks.History at 0x7aeb01ff3a30>

Test and Evaluate

In [ ]:
test_loss, test_acc = TL_model.evaluate(test_imgs, test_labels)
1/1 [==============================] - 2s 2s/step - loss: 0.2379 - accuracy: 1.0000
In [ ]:
test_x = test_imgs[np.random.choice(n_test, 1)]
pred = np.argmax(TL_model.predict(test_x, verbose = 0))

plt.figure(figsize = (4, 4))
plt.imshow(test_x.reshape(224, 224, 3))
plt.axis('off')
plt.show()

print('Prediction : {}'.format(Dict[pred]))
Prediction : Screw
In [ ]:
%%javascript
$.getScript('https://kmahelona.github.io/ipython_notebook_goodies/ipython_notebook_toc.js')