1. ImageNet¶

Human performance: about 5.1% top-5 error

1.1. LeNet¶

CNN = Convolutional Neural Networks = ConvNet
LeCun, Y., Bottou, L., Bengio, Y., and Haffner, P. (1998). Gradient-based learning applied to document recognition.
Its building blocks (convolution, subsampling/pooling, and fully connected layers) are all still the basic components of modern ConvNets!

1.2. AlexNet¶

Simplified version of Krizhevsky, Alex, Sutskever, and Hinton. "Imagenet classification with deep convolutional neural networks." NIPS 2012
LeNet-style backbone, plus:
- ReLU [Nair & Hinton 2010]
  - RevoLUtion of deep learning
  - Accelerate training
- Dropout [Hinton et al 2012]
  - In-network ensembling
  - Reduce overfitting
- Data augmentation
  - Label-preserving transformation
  - Reduce overfitting
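The three additions listed above can be sketched in a few lines of NumPy. This is only an illustration, not the AlexNet implementation: the function names are hypothetical, the dropout shown is the "inverted" variant (rescaling at training time) that most current frameworks use, and a horizontal flip stands in for the broader family of label-preserving transformations.

```python
import numpy as np

def relu(x):
    """Element-wise max(0, x): cheap to compute and does not saturate for x > 0."""
    return np.maximum(0.0, x)

def dropout(x, rate, rng, training=True):
    """Inverted dropout: zero a fraction `rate` of the units and rescale the rest."""
    if not training or rate == 0.0:
        return x
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)

def random_horizontal_flip(image, rng, p=0.5):
    """Label-preserving augmentation for an (H, W, C) image."""
    return image[:, ::-1, :] if rng.random() < p else image

rng = np.random.default_rng(0)
activations = relu(np.array([-2.0, 0.0, 3.0]))
dropped = dropout(np.ones(1000), rate=0.5, rng=rng)   # entries are either 0 or 2
image = np.arange(12).reshape(2, 2, 3)
flipped = random_horizontal_flip(image, rng, p=1.0)   # p=1.0 forces the flip
```

At inference time `dropout(..., training=False)` returns its input unchanged, so no rescaling is needed there.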

1.3. VGG-16/19¶

Simonyan, Karen, and Zisserman. "Very deep convolutional networks for large-scale image recognition." (2014)
Simply “Very Deep”!
- Modularized design
  - 3x3 Conv as the module
  - Stack the same module
  - Same computation for each module
- Stage-wise training
  - VGG-11 → VGG-13 → VGG-16
  - We need a better initialization…
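One way to see why stacking the same 3x3 module works well is to compare receptive field and weight count against a single large kernel. The sketch below assumes stride-1 convolutions, equal input and output channel widths, and no biases; the helper names and the channel width `C` are illustrative choices.

```python
def stacked_receptive_field(num_layers, kernel_size=3):
    """Receptive field of `num_layers` stacked stride-1 convolutions."""
    return 1 + num_layers * (kernel_size - 1)

def conv_weights(kernel_size, channels_in, channels_out):
    """Weight count of a 2-D convolution, ignoring biases."""
    return kernel_size * kernel_size * channels_in * channels_out

C = 256  # illustrative channel width
three_3x3 = 3 * conv_weights(3, C, C)   # 27 * C^2 weights, sees a 7x7 window
one_7x7 = conv_weights(7, C, C)         # 49 * C^2 weights, sees a 7x7 window

print(stacked_receptive_field(3), three_3x3, one_7x7)
```

Three stacked 3x3 layers cover the same 7x7 window as one 7x7 layer with roughly 45% fewer weights, and they interleave three non-linearities instead of one.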

1.4. GoogLeNet/Inception¶

Multiple branches
- e.g., 1x1, 3x3, 5x5, pool
Shortcuts
- stand-alone 1x1, merged by concat.
Bottleneck
- Reduce dim by 1x1 before expensive 3x3/5x5 conv
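The saving from the bottleneck can be checked with simple weight-count arithmetic. The channel counts below are illustrative values chosen for the example (biases ignored), and `conv_weights` is a hypothetical helper.

```python
def conv_weights(kernel_size, channels_in, channels_out):
    """Weight count of a 2-D convolution, ignoring biases."""
    return kernel_size * kernel_size * channels_in * channels_out

c_in, c_reduced, c_out = 192, 16, 32  # illustrative channel counts

# 5x5 convolution applied directly to the full-width input
direct = conv_weights(5, c_in, c_out)

# 1x1 reduction first, then the 5x5 convolution on the narrow tensor
bottleneck = conv_weights(1, c_in, c_reduced) + conv_weights(5, c_reduced, c_out)

# Branch outputs are merged by concatenation, so their channel counts add up
branch_channels = [64, 128, 32, 32]   # e.g. 1x1, 3x3, 5x5, pool branches
merged_channels = sum(branch_channels)

print(direct, bottleneck, merged_channels)
```

With these numbers the bottlenecked branch needs 15,872 weights instead of 153,600, close to a tenfold reduction.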

1.5. ResNet¶

He, Kaiming, et al. "Deep residual learning for image recognition." CVPR. 2016.

Skip Connection and Residual Net

A skip connection is a direct connection between two non-consecutive layers, i.e., one that bypasses at least one layer.
It mitigates vanishing gradients, because the gradient can flow back through the identity path.
The parameters are optimized to learn a residual, that is, the difference between the value before the block and the one needed after.
In encoder-decoder networks, skip connections are often used to transfer local information by concatenating or summing feature maps from the downsampling path with feature maps from the upsampling path.
Merging features from various resolution levels helps combine context information with spatial information.

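def residual_net(x):
    # x must already have 32 channels so that it can be added to conv2.
    # n_hidden and n_output are assumed to be defined elsewhere in the notebook.
    conv1 = tf.keras.layers.Conv2D(filters = 32,
                                   kernel_size = (3, 3),
                                   padding = "SAME",
                                   activation = 'relu')(x)

    conv2 = tf.keras.layers.Conv2D(filters = 32,
                                   kernel_size = (3, 3),
                                   padding = "SAME",
                                   activation = 'relu')(conv1)

    # Skip connection: add the block input to the output of the two convolutions
    skip = tf.keras.layers.Add()([conv2, x])

    maxp2 = tf.keras.layers.MaxPool2D(pool_size = (2, 2),
                                      strides = 2)(skip)

    flat = tf.keras.layers.Flatten()(maxp2)

    hidden = tf.keras.layers.Dense(units = n_hidden,
                                   activation='relu')(flat)

    output = tf.keras.layers.Dense(units = n_output)(hidden)

    return output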
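The idea that the weights only have to learn a residual can be illustrated with a toy NumPy block. This is a sketch under simplifying assumptions: dense matrices `w1` and `w2` stand in for the convolutions, and the ReLU is applied after the addition; all names are hypothetical.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def residual_block(x, w1, w2):
    """y = relu(x + F(x)), where F(x) = w2 @ relu(w1 @ x) is the learned residual."""
    residual = w2 @ relu(w1 @ x)
    return relu(x + residual)

x = np.array([1.0, 2.0, 3.0])
zeros = np.zeros((3, 3))

# With all-zero weights the residual vanishes and the block passes x through unchanged
identity_out = residual_block(x, zeros, zeros)

# With identity weights the residual equals x, so the output is 2 * x
doubled_out = residual_block(x, np.eye(3), np.eye(3))
```

Because the identity mapping corresponds to zero weights, adding more such blocks does not by itself make the identity harder to represent.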

1.6. DenseNets¶

1.7. U-Net¶

The U-Net owes its name to its symmetric, U-shaped architecture.
The U-Net architecture builds on the Fully Convolutional Network (FCN) and modifies it to yield better segmentation in medical imaging.
Compared to FCN-8, the two main differences are:
- U-Net is symmetric, and
- the skip connections between the downsampling path and the upsampling path apply a concatenation operator instead of a sum.
These skip connections provide local information to the global information while upsampling. Because of its symmetry, the network has a large number of feature maps in the upsampling path, which allows it to transfer information to the higher-resolution layers.
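The difference between the two merge operators shows up directly in the tensor shapes. The sketch below uses placeholder arrays in channels-last layout; the sizes are arbitrary and chosen only for illustration.

```python
import numpy as np

# Feature maps at the same resolution from the two paths (batch, H, W, channels)
encoder_features = np.ones((1, 64, 64, 32))
decoder_features = np.ones((1, 64, 64, 32))

# Sum (FCN-style): channel counts must match, and the result keeps that count
summed = encoder_features + decoder_features

# Concatenation (U-Net-style): channels are stacked, so both sources stay separate
concatenated = np.concatenate([encoder_features, decoder_features], axis=-1)

print(summed.shape, concatenated.shape)
```

With concatenation, the following convolution decides how to mix the two sources, at the cost of a wider input tensor.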

2. Load Pre-trained Models¶

2.1. List of Available Models¶

VGG16
VGG19
ResNet
GoogLeNet/Inception
DenseNet
MobileNet

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import cv2

%matplotlib inline

3. Model Selection¶

# model_type = tf.keras.applications.densenet
# model_type = tf.keras.applications.inception_resnet_v2
# model_type = tf.keras.applications.inception_v3
model_type = tf.keras.applications.mobilenet
# model_type = tf.keras.applications.mobilenet_v2
# model_type = tf.keras.applications.nasnet
# model_type = tf.keras.applications.resnet50
# model_type = tf.keras.applications.vgg16
# model_type = tf.keras.applications.vgg19
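Each module listed above exposes a model constructor with a capitalized name, and not all of them use a 224x224 default input. The lookup table below is a small convenience sketch: `APPLICATIONS` and `constructor_for` are hypothetical names, and for modules that offer several variants (`densenet`, `nasnet`) one variant was picked arbitrarily.

```python
# module name -> (constructor name, default input height/width)
APPLICATIONS = {
    "densenet": ("DenseNet121", 224),
    "inception_resnet_v2": ("InceptionResNetV2", 299),
    "inception_v3": ("InceptionV3", 299),
    "mobilenet": ("MobileNet", 224),
    "mobilenet_v2": ("MobileNetV2", 224),
    "nasnet": ("NASNetMobile", 224),
    "resnet50": ("ResNet50", 224),
    "vgg16": ("VGG16", 224),
    "vgg19": ("VGG19", 224),
}

def constructor_for(module_name):
    """Return (constructor name, default input size) for a module name."""
    return APPLICATIONS[module_name]

name, size = constructor_for("mobilenet")
print(name, size)

# With TensorFlow imported as tf, the model could then be built as:
#   model_type = getattr(tf.keras.applications, "mobilenet")
#   model = getattr(model_type, name)()
```

The input size matters later: an image resized to 224x224 will not match the default input shape of the Inception variants.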

3.1. Model Summary¶

model = model_type.MobileNet() # Change the constructor to match the selected module (capitalized name, e.g. VGG16, ResNet50)

model.summary()

Model: "mobilenet_1.00_224"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 224, 224, 3)]     0         
                                                                 
 conv1 (Conv2D)              (None, 112, 112, 32)      864       
                                                                 
 conv1_bn (BatchNormalizati  (None, 112, 112, 32)      128       
 on)                                                             
                                                                 
 conv1_relu (ReLU)           (None, 112, 112, 32)      0         
                                                                 
 conv_dw_1 (DepthwiseConv2D  (None, 112, 112, 32)      288       
 )                                                               
                                                                 
 conv_dw_1_bn (BatchNormali  (None, 112, 112, 32)      128       
 zation)                                                         
                                                                 
 conv_dw_1_relu (ReLU)       (None, 112, 112, 32)      0         
                                                                 
 conv_pw_1 (Conv2D)          (None, 112, 112, 64)      2048      
                                                                 
 conv_pw_1_bn (BatchNormali  (None, 112, 112, 64)      256       
 zation)                                                         
                                                                 
 conv_pw_1_relu (ReLU)       (None, 112, 112, 64)      0         
                                                                 
 conv_pad_2 (ZeroPadding2D)  (None, 113, 113, 64)      0         
                                                                 
 conv_dw_2 (DepthwiseConv2D  (None, 56, 56, 64)        576       
 )                                                               
                                                                 
 conv_dw_2_bn (BatchNormali  (None, 56, 56, 64)        256       
 zation)                                                         
                                                                 
 conv_dw_2_relu (ReLU)       (None, 56, 56, 64)        0         
                                                                 
 conv_pw_2 (Conv2D)          (None, 56, 56, 128)       8192      
                                                                 
 conv_pw_2_bn (BatchNormali  (None, 56, 56, 128)       512       
 zation)                                                         
                                                                 
 conv_pw_2_relu (ReLU)       (None, 56, 56, 128)       0         
                                                                 
 conv_dw_3 (DepthwiseConv2D  (None, 56, 56, 128)       1152      
 )                                                               
                                                                 
 conv_dw_3_bn (BatchNormali  (None, 56, 56, 128)       512       
 zation)                                                         
                                                                 
 conv_dw_3_relu (ReLU)       (None, 56, 56, 128)       0         
                                                                 
 conv_pw_3 (Conv2D)          (None, 56, 56, 128)       16384     
                                                                 
 conv_pw_3_bn (BatchNormali  (None, 56, 56, 128)       512       
 zation)                                                         
                                                                 
 conv_pw_3_relu (ReLU)       (None, 56, 56, 128)       0         
                                                                 
 conv_pad_4 (ZeroPadding2D)  (None, 57, 57, 128)       0         
                                                                 
 conv_dw_4 (DepthwiseConv2D  (None, 28, 28, 128)       1152      
 )                                                               
                                                                 
 conv_dw_4_bn (BatchNormali  (None, 28, 28, 128)       512       
 zation)                                                         
                                                                 
 conv_dw_4_relu (ReLU)       (None, 28, 28, 128)       0         
                                                                 
 conv_pw_4 (Conv2D)          (None, 28, 28, 256)       32768     
                                                                 
 conv_pw_4_bn (BatchNormali  (None, 28, 28, 256)       1024      
 zation)                                                         
                                                                 
 conv_pw_4_relu (ReLU)       (None, 28, 28, 256)       0         
                                                                 
 conv_dw_5 (DepthwiseConv2D  (None, 28, 28, 256)       2304      
 )                                                               
                                                                 
 conv_dw_5_bn (BatchNormali  (None, 28, 28, 256)       1024      
 zation)                                                         
                                                                 
 conv_dw_5_relu (ReLU)       (None, 28, 28, 256)       0         
                                                                 
 conv_pw_5 (Conv2D)          (None, 28, 28, 256)       65536     
                                                                 
 conv_pw_5_bn (BatchNormali  (None, 28, 28, 256)       1024      
 zation)                                                         
                                                                 
 conv_pw_5_relu (ReLU)       (None, 28, 28, 256)       0         
                                                                 
 conv_pad_6 (ZeroPadding2D)  (None, 29, 29, 256)       0         
                                                                 
 conv_dw_6 (DepthwiseConv2D  (None, 14, 14, 256)       2304      
 )                                                               
                                                                 
 conv_dw_6_bn (BatchNormali  (None, 14, 14, 256)       1024      
 zation)                                                         
                                                                 
 conv_dw_6_relu (ReLU)       (None, 14, 14, 256)       0         
                                                                 
 conv_pw_6 (Conv2D)          (None, 14, 14, 512)       131072    
                                                                 
 conv_pw_6_bn (BatchNormali  (None, 14, 14, 512)       2048      
 zation)                                                         
                                                                 
 conv_pw_6_relu (ReLU)       (None, 14, 14, 512)       0         
                                                                 
 conv_dw_7 (DepthwiseConv2D  (None, 14, 14, 512)       4608      
 )                                                               
                                                                 
 conv_dw_7_bn (BatchNormali  (None, 14, 14, 512)       2048      
 zation)                                                         
                                                                 
 conv_dw_7_relu (ReLU)       (None, 14, 14, 512)       0         
                                                                 
 conv_pw_7 (Conv2D)          (None, 14, 14, 512)       262144    
                                                                 
 conv_pw_7_bn (BatchNormali  (None, 14, 14, 512)       2048      
 zation)                                                         
                                                                 
 conv_pw_7_relu (ReLU)       (None, 14, 14, 512)       0         
                                                                 
 conv_dw_8 (DepthwiseConv2D  (None, 14, 14, 512)       4608      
 )                                                               
                                                                 
 conv_dw_8_bn (BatchNormali  (None, 14, 14, 512)       2048      
 zation)                                                         
                                                                 
 conv_dw_8_relu (ReLU)       (None, 14, 14, 512)       0         
                                                                 
 conv_pw_8 (Conv2D)          (None, 14, 14, 512)       262144    
                                                                 
 conv_pw_8_bn (BatchNormali  (None, 14, 14, 512)       2048      
 zation)                                                         
                                                                 
 conv_pw_8_relu (ReLU)       (None, 14, 14, 512)       0         
                                                                 
 conv_dw_9 (DepthwiseConv2D  (None, 14, 14, 512)       4608      
 )                                                               
                                                                 
 conv_dw_9_bn (BatchNormali  (None, 14, 14, 512)       2048      
 zation)                                                         
                                                                 
 conv_dw_9_relu (ReLU)       (None, 14, 14, 512)       0         
                                                                 
 conv_pw_9 (Conv2D)          (None, 14, 14, 512)       262144    
                                                                 
 conv_pw_9_bn (BatchNormali  (None, 14, 14, 512)       2048      
 zation)                                                         
                                                                 
 conv_pw_9_relu (ReLU)       (None, 14, 14, 512)       0         
                                                                 
 conv_dw_10 (DepthwiseConv2  (None, 14, 14, 512)       4608      
 D)                                                              
                                                                 
 conv_dw_10_bn (BatchNormal  (None, 14, 14, 512)       2048      
 ization)                                                        
                                                                 
 conv_dw_10_relu (ReLU)      (None, 14, 14, 512)       0         
                                                                 
 conv_pw_10 (Conv2D)         (None, 14, 14, 512)       262144    
                                                                 
 conv_pw_10_bn (BatchNormal  (None, 14, 14, 512)       2048      
 ization)                                                        
                                                                 
 conv_pw_10_relu (ReLU)      (None, 14, 14, 512)       0         
                                                                 
 conv_dw_11 (DepthwiseConv2  (None, 14, 14, 512)       4608      
 D)                                                              
                                                                 
 conv_dw_11_bn (BatchNormal  (None, 14, 14, 512)       2048      
 ization)                                                        
                                                                 
 conv_dw_11_relu (ReLU)      (None, 14, 14, 512)       0         
                                                                 
 conv_pw_11 (Conv2D)         (None, 14, 14, 512)       262144    
                                                                 
 conv_pw_11_bn (BatchNormal  (None, 14, 14, 512)       2048      
 ization)                                                        
                                                                 
 conv_pw_11_relu (ReLU)      (None, 14, 14, 512)       0         
                                                                 
 conv_pad_12 (ZeroPadding2D  (None, 15, 15, 512)       0         
 )                                                               
                                                                 
 conv_dw_12 (DepthwiseConv2  (None, 7, 7, 512)         4608      
 D)                                                              
                                                                 
 conv_dw_12_bn (BatchNormal  (None, 7, 7, 512)         2048      
 ization)                                                        
                                                                 
 conv_dw_12_relu (ReLU)      (None, 7, 7, 512)         0         
                                                                 
 conv_pw_12 (Conv2D)         (None, 7, 7, 1024)        524288    
                                                                 
 conv_pw_12_bn (BatchNormal  (None, 7, 7, 1024)        4096      
 ization)                                                        
                                                                 
 conv_pw_12_relu (ReLU)      (None, 7, 7, 1024)        0         
                                                                 
 conv_dw_13 (DepthwiseConv2  (None, 7, 7, 1024)        9216      
 D)                                                              
                                                                 
 conv_dw_13_bn (BatchNormal  (None, 7, 7, 1024)        4096      
 ization)                                                        
                                                                 
 conv_dw_13_relu (ReLU)      (None, 7, 7, 1024)        0         
                                                                 
 conv_pw_13 (Conv2D)         (None, 7, 7, 1024)        1048576   
                                                                 
 conv_pw_13_bn (BatchNormal  (None, 7, 7, 1024)        4096      
 ization)                                                        
                                                                 
 conv_pw_13_relu (ReLU)      (None, 7, 7, 1024)        0         
                                                                 
 global_average_pooling2d (  (None, 1, 1, 1024)        0         
 GlobalAveragePooling2D)                                         
                                                                 
 dropout (Dropout)           (None, 1, 1, 1024)        0         
                                                                 
 conv_preds (Conv2D)         (None, 1, 1, 1000)        1025000   
                                                                 
 reshape_2 (Reshape)         (None, 1000)              0         
                                                                 
 predictions (Activation)    (None, 1000)              0         
                                                                 
=================================================================
Total params: 4253864 (16.23 MB)
Trainable params: 4231976 (16.14 MB)
Non-trainable params: 21888 (85.50 KB)
_________________________________________________________________
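The parameter counts in the summary can be reproduced by hand, which also shows where the savings of depthwise-separable convolutions come from. The helpers below are hypothetical; they assume that the convolutions feeding a batch-normalization layer carry no bias (consistent with the counts above) and that batch normalization stores four values per channel.

```python
def conv_params(kernel, c_in, c_out, bias=False):
    """Parameters of a standard 2-D convolution."""
    return kernel * kernel * c_in * c_out + (c_out if bias else 0)

def depthwise_params(kernel, channels):
    """Parameters of a depthwise convolution: one kernel per input channel."""
    return kernel * kernel * channels

def batchnorm_params(channels):
    """gamma, beta, moving mean, moving variance per channel."""
    return 4 * channels

conv1 = conv_params(3, 3, 32)                        # 864
conv1_bn = batchnorm_params(32)                      # 128
conv_dw_1 = depthwise_params(3, 32)                  # 288
conv_pw_1 = conv_params(1, 32, 64)                   # 2048
conv_preds = conv_params(1, 1024, 1000, bias=True)   # 1025000

# Depthwise + pointwise versus a standard 3x3 convolution with the same widths
separable = conv_dw_1 + conv_pw_1                    # 2336
standard_3x3 = conv_params(3, 32, 64)                # 18432
```

For this first block, the separable pair uses about one eighth of the weights of the equivalent standard convolution.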

4. ImageNet¶

Download image 1
Download image 2

from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).

# img = cv2.imread('/content/drive/MyDrive/DL_Colab/DL_data/ILSVRC2017_test_00000005.JPEG')
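img = cv2.imread('/content/drive/MyDrive/DL_Colab/DL_data/ILSVRC2017_test_00005381.JPEG')

# cv2.imread returns channels in BGR order; convert to RGB, which plt.imshow and preprocess_input expect
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)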

print(img.shape)

plt.figure(figsize = (6, 6))
plt.imshow(img)
plt.axis('off')
plt.show()

(500, 333, 3)

resized_img = cv2.resize(img, (224, 224)).reshape(1, 224, 224, 3)

plt.figure(figsize = (6, 6))
plt.imshow(resized_img[0])
plt.axis('off')
plt.show()

input_img = model_type.preprocess_input(resized_img)
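For the MobileNet module selected above, `preprocess_input` rescales pixel values from [0, 255] to [-1, 1]; other families such as VGG and ResNet50 instead subtract per-channel ImageNet means. The NumPy function below is a sketch of the MobileNet-style scaling only, with a hypothetical name and a dummy batch.

```python
import numpy as np

def scale_to_unit_range(pixels):
    """Map pixel values from [0, 255] to [-1, 1]."""
    return pixels.astype(np.float32) / 127.5 - 1.0

# Dummy batch with the same layout as the resized image: (1, 224, 224, 3), uint8
batch = np.zeros((1, 224, 224, 3), dtype=np.uint8)
batch[0, 0, 0] = [0, 128, 255]

scaled = scale_to_unit_range(batch)
print(scaled.dtype, scaled.min(), scaled.max())
```

Because the preprocessing differs between model families, it is safest to always call the `preprocess_input` of the same module that provides the model.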

pred = model.predict(input_img, verbose = 0)
label = model_type.decode_predictions(pred)[0]

# Print the top-5 predictions as "class_name (probability%)"
for _, name, prob in label:
    print('%s (%.2f%%)\n' % (name, prob*100))

soccer_ball (92.07%)

knee_pad (2.68%)

football_helmet (2.44%)

ballplayer (1.17%)

tennis_ball (0.49%)
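Conceptually, the ranking above amounts to sorting the probability vector and looking up the class names of the largest entries. The sketch below mimics that step with NumPy; `top_k` is a hypothetical helper, and the three-class probabilities are made-up toy values, not model outputs.

```python
import numpy as np

def top_k(probabilities, class_names, k=5):
    """Return the k most probable (class name, probability) pairs, best first."""
    order = np.argsort(probabilities)[::-1][:k]
    return [(class_names[i], float(probabilities[i])) for i in order]

# Toy example with three classes (values invented for illustration)
probs = np.array([0.05, 0.70, 0.25])
names = ["knee_pad", "soccer_ball", "tennis_ball"]

ranked = top_k(probs, names, k=2)
print(ranked)
```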
