Fully Convolutional Networks (FCN)
Table of Contents
from IPython.display import YouTubeVideo
YouTubeVideo('sKDv7yp3Jdk?si=zflwWs4XaLljHfNC', width = "560", height = "315")
tf.keras.layers.Conv2D(filters, kernel_size, strides, padding, activation, kernel_regularizer, input_shape)
filters = 32
kernel_size = (3,3)
strides = (1,1)
padding = 'SAME'
activation = 'relu'
kernel_regularizer=tf.keras.regularizers.l2(0.04)
input_shape = (input_h, input_w, input_ch)
filters
kernel_size
strides
padding
'SAME': enable zero padding
'VALID': disable zero padding
activation
kernel_regularizer
input and output channels
Examples
input = [None, 4, 4, 1]
filter size = [3, 3, 1, 1]
strides = [1, 1, 1, 1]
padding = 'VALID'
input = [None, 5, 5, 1]
filter size = [3, 3, 1, 1]
strides = [1, 1, 1, 1]
padding = 'SAME'
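As a quick sanity check of the two examples above, the output spatial size is ceil((n - k + 1)/s) for 'VALID' and ceil(n/s) for 'SAME'. A minimal sketch (not part of the original cells) that verifies both cases with dummy inputs:
import tensorflow as tf
x_valid = tf.zeros([1, 4, 4, 1])   # one 4x4 single-channel image
x_same = tf.zeros([1, 5, 5, 1])    # one 5x5 single-channel image
conv_valid = tf.keras.layers.Conv2D(filters = 1, kernel_size = (3,3), strides = (1,1), padding = 'VALID')
conv_same = tf.keras.layers.Conv2D(filters = 1, kernel_size = (3,3), strides = (1,1), padding = 'SAME')
print(conv_valid(x_valid).shape)   # (1, 2, 2, 1): no zero padding shrinks the feature map
print(conv_same(x_same).shape)     # (1, 5, 5, 1): zero padding preserves the spatial size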
The need for transposed convolutions generally arises from the desire to use a transformation going in the opposite direction of a normal convolution, i.e., from something that has the shape of the output of some convolution to something that has the shape of its input while maintaining a connectivity pattern that is compatible with said convolution. For instance, one might use such a transformation as the decoding layer of a convolutional autoencoder or to project feature maps to a higher-dimensional space.
Some sources use the name deconvolution, which is inappropriate because it's not a deconvolution. To make things worse, deconvolutions do exist, but they're not common in the field of deep learning.
An actual deconvolution reverts the process of a convolution.
Imagine inputting an image into a single convolutional layer. Now take the output, throw it into a black box and out comes your original image again. This black box does a deconvolution. It is the mathematical inverse of what a convolutional layer does.
A transposed convolution is somewhat similar because it produces the same spatial resolution a hypothetical deconvolutional layer would. However, the actual mathematical operation that’s being performed on the values is different.
A transposed convolutional layer carries out a regular convolution but reverts its spatial transformation.
tf.keras.layers.Conv2DTranspose(filters, kernel_size, strides, padding = 'SAME', activation)
filters = number of output channels (e.g., 64)
kernel_size = size of the convolution window, e.g., (3,3)
strides = stride of the sliding window for each dimension of the input tensor
padding = 'SAME'
activation = activation function ('softmax', 'relu', ...)
'SAME': enable zero padding
'VALID': disable zero padding
An image of 5x5 is fed into a convolutional layer. The stride is set to 2, the padding is deactivated and the kernel is 3x3. This results in a 2x2 image.
If we wanted to reverse this process, we’d need the inverse mathematical operation so that 9 values are generated from each pixel we input. Afterward, we traverse the output image with a stride of 2. This would be a deconvolution.
A transposed convolution does not do that. The only thing the two operations have in common is that the output is guaranteed to be a 5x5 image as well, while still performing a normal convolution operation. To achieve this, we need to perform some fancy padding on the input.
It merely reconstructs the spatial resolution from before and performs a convolution. This may not be the mathematical inverse, but for Encoder-Decoder architectures, it’s still very helpful. This way we can combine the upscaling of an image with a convolution, instead of doing two separate processes.
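A small sketch of the 5x5 example above (illustrative, with untrained kernels): a strided convolution reduces the image to 2x2, and a transposed convolution with the same kernel size and stride brings the spatial size back to 5x5, even though the original values are not recovered.
import tensorflow as tf
x = tf.zeros([1, 5, 5, 1])
down = tf.keras.layers.Conv2D(filters = 1, kernel_size = (3,3), strides = (2,2), padding = 'VALID')
up = tf.keras.layers.Conv2DTranspose(filters = 1, kernel_size = (3,3), strides = (2,2), padding = 'VALID')
y = down(x)
print(y.shape)       # (1, 2, 2, 1)
print(up(y).shape)   # (1, 5, 5, 1): spatial size restored, but not the original values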
Strides and padding for transposed convolution (optional)
A transposed 2-D convolution layer upsamples feature maps.
This layer is sometimes incorrectly known as a "deconvolution" or "deconv" layer. This layer is the transpose of convolution and does not perform deconvolution.
%%html
<iframe src="https://www.youtube.com/embed/nTt_ajul8NY?start=725"
width="560" height="315" frameborder="0" allowfullscreen></iframe>
Import Library
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
%matplotlib inline
Load MNIST Data
# Load Data
mnist = tf.keras.datasets.mnist
(train_imgs, train_labels), (test_imgs, test_labels) = mnist.load_data()
train_imgs, test_imgs = train_imgs/255.0, test_imgs/255.0
# Use Only 1,5,6 Digits to Visualize
train_x = train_imgs[np.hstack([np.where(train_labels == 1),
np.where(train_labels == 5),
np.where(train_labels == 6)])][0]
train_y = train_labels[np.hstack([np.where(train_labels == 1),
np.where(train_labels == 5),
np.where(train_labels == 6)])][0]
test_x = test_imgs[np.hstack([np.where(test_labels == 1),
np.where(test_labels == 5),
np.where(test_labels == 6)])][0]
test_y = test_labels[np.hstack([np.where(test_labels == 1),
np.where(test_labels == 5),
np.where(test_labels == 6)])][0]
train_x = train_x.reshape(-1,28,28,1)
test_x = test_x.reshape(-1,28,28,1)
The following architecture has been implemented.
Build a Model
encoder = tf.keras.models.Sequential([
tf.keras.layers.Conv2D(filters = 32,
kernel_size = (3,3),
strides = (2,2),
activation = 'relu',
padding = 'SAME',
input_shape = (28, 28, 1)),
tf.keras.layers.Conv2D(filters = 64,
kernel_size = (3,3),
strides = (2,2),
activation = 'relu',
padding = 'SAME',
input_shape = (14, 14, 32)),
tf.keras.layers.Conv2D(filters = 2,
kernel_size = (7,7),
padding = 'VALID',
input_shape = (7,7,64))
])
decoder = tf.keras.models.Sequential([
tf.keras.layers.Conv2DTranspose(filters = 64,
kernel_size = (7,7),
strides = (1,1),
activation = 'relu',
padding = 'VALID',
input_shape = (1, 1, 2)),
tf.keras.layers.Conv2DTranspose(filters = 32,
kernel_size = (3,3),
strides = (2,2),
activation = 'relu',
padding = 'SAME',
input_shape = (7, 7, 64)),
tf.keras.layers.Conv2DTranspose(filters = 1,
kernel_size = (7,7),
strides = (2,2),
padding = 'SAME',
input_shape = (14,14,32))
])
latent = encoder.output
result = decoder(latent)
model = tf.keras.Model(inputs = encoder.input, outputs = result)
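A quick check (not in the original cells) that the bottleneck really is a 1x1x2 latent code and that the decoder maps it back to a 28x28x1 image:
print(encoder.output_shape)   # (None, 1, 1, 2): two latent values per image
print(decoder.output_shape)   # (None, 28, 28, 1): reconstructed image size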
Define Loss and Optimizer
model.compile(optimizer = 'adam',
loss = 'mean_squared_error')
Define Optimization Configuration and Then Optimize
model.fit(train_x, train_x, epochs = 10)
test_img = test_x[[6]]
x_reconst = model.predict(test_img)
plt.figure(figsize = (6, 4))
plt.subplot(1,2,1)
plt.imshow(test_img.reshape(28,28), 'gray')
plt.title('Input image', fontsize = 15)
plt.axis('off')
plt.subplot(1,2,2)
plt.imshow(x_reconst.reshape(28,28), 'gray')
plt.title('Reconstructed image', fontsize = 15)
plt.axis('off')
plt.show()
idx = np.random.choice(test_y.shape[0], 500)
rnd_x, rnd_y = test_x[idx], test_y[idx]
rnd_latent = encoder.predict(rnd_x)
rnd_latent = rnd_latent.reshape(-1,2)
plt.figure(figsize = (6, 6))
plt.scatter(rnd_latent[rnd_y == 1, 0], rnd_latent[rnd_y == 1, 1], label = '1')
plt.scatter(rnd_latent[rnd_y == 5, 0], rnd_latent[rnd_y == 5, 1], label = '5')
plt.scatter(rnd_latent[rnd_y == 6, 0], rnd_latent[rnd_y == 6, 1], label = '6')
plt.title('Latent Space', fontsize = 15)
plt.xlabel('Z1', fontsize = 15)
plt.ylabel('Z2', fontsize = 15)
plt.legend(fontsize = 15)
plt.axis('equal')
plt.show()
new_latent = np.array([[-8, 0]]).reshape(-1,1,1,2)
fake_img = decoder.predict(new_latent)
plt.figure(figsize = (9, 4))
plt.subplot(1,2,1)
plt.scatter(rnd_latent[rnd_y == 1, 0], rnd_latent[rnd_y == 1, 1], label = '1')
plt.scatter(rnd_latent[rnd_y == 5, 0], rnd_latent[rnd_y == 5, 1], label = '5')
plt.scatter(rnd_latent[rnd_y == 6, 0], rnd_latent[rnd_y == 6, 1], label = '6')
plt.scatter(new_latent[:,:,:,0], new_latent[:,:,:,1], c = 'k', marker = 'o', s = 200, label = 'new data')
plt.title('Latent Space', fontsize = 15)
plt.xlabel('Z1', fontsize = 15)
plt.ylabel('Z2', fontsize = 15)
plt.legend(loc = 2, fontsize = 12)
plt.axis('equal')
plt.subplot(1,2,2)
plt.imshow(fake_img.reshape(28,28), 'gray')
plt.title('Generated Fake Image', fontsize = 15)
plt.xticks([])
plt.yticks([])
plt.show()
from IPython.display import YouTubeVideo
YouTubeVideo('sKDv7yp3Jdk?si=msi2nCF34udI3bUj&start=1230', width = "560", height = "315")
The segmentation task differs from the classification task because it requires predicting a class for each pixel of the input image, instead of a single class for the whole input.
Classification needs to understand what is in the input (namely, the context).
However, in order to predict what is in the input for each pixel, segmentation needs to recover not only what is in the input, but also where.
Segment images into regions with different semantic categories. These semantic regions label and predict objects at the pixel level.
An FCN is built only from locally connected layers, such as convolution, pooling, and upsampling.
Note that no dense layer is used in this kind of architecture.
The network can work regardless of the original image size, without requiring any fixed number of units at any stage.
To obtain a segmentation map (output), segmentation networks usually have two parts: a downsampling path and an upsampling path.
The downsampling path is used to extract and interpret the context (what), while the upsampling path is used to enable precise localization (where).
Furthermore, to fully recover the fine-grained spatial information lost in the pooling or downsampling layers, we often use skip connections.
Given a position on the spatial dimension, the output of the channel dimension will be a category prediction of the pixel corresponding to the location.
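In other words, the channel dimension holds per-class scores for every pixel, and the predicted label map is obtained by an argmax over channels. A minimal sketch (array shapes and names are illustrative):
import numpy as np
pred = np.random.rand(224, 224, 2)       # per-pixel scores for 2 classes
label_map = np.argmax(pred, axis = -1)   # (224, 224): one predicted class per pixel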
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from google.colab import drive
drive.mount('/content/drive')
seg_train_imgs = np.load('/content/drive/MyDrive/DL/DL_data/seg_train_imgs.npy')
seg_train_labels = np.load('/content/drive/MyDrive/DL/DL_data/seg_train_labels.npy')
seg_test_imgs = np.load('/content/drive/MyDrive/DL/DL_data/seg_test_imgs.npy')
n_train = seg_train_imgs.shape[0]
n_test = seg_test_imgs.shape[0]
print ("The number of training images : {}, shape : {}".format(n_train, seg_train_imgs.shape))
print ("The number of segmented images : {}, shape : {}".format(n_train, seg_train_labels.shape))
print ("The number of testing images : {}, shape : {}".format(n_test, seg_test_imgs.shape))
## binary segmentation and one-hot encoding in this case
idx = np.random.randint(n_train)
plt.figure(figsize = (10, 4))
plt.subplot(1,3,1)
plt.imshow(seg_train_imgs[idx])
plt.axis('off')
plt.subplot(1,3,2)
plt.imshow(seg_train_labels[idx][:,:,0])
plt.axis('off')
plt.subplot(1,3,3)
plt.imshow(seg_train_labels[idx][:,:,1])
plt.axis('off')
plt.show()
Utilize VGG16 Model for Encoder
model_type = tf.keras.applications.vgg16
base_model = model_type.VGG16()
base_model.trainable = False
base_model.summary()
Build an FCN Model
map5 = base_model.layers[-5].output
# sixth convolution layer
conv6 = tf.keras.layers.Conv2D(filters = 4096,
kernel_size = (7,7),
padding = 'SAME',
activation = 'relu')(map5)
# 1x1 convolution layers
fcn4 = tf.keras.layers.Conv2D(filters = 4096,
kernel_size = (1,1),
padding = 'SAME',
activation = 'relu')(conv6)
fcn3 = tf.keras.layers.Conv2D(filters = 2,
kernel_size = (1,1),
padding = 'SAME',
activation = 'relu')(fcn4)
# Upsampling layers
fcn2 = tf.keras.layers.Conv2DTranspose(filters = 512,
kernel_size = (4,4),
strides = (2,2),
padding = 'SAME')(fcn3)
fcn1 = tf.keras.layers.Conv2DTranspose(filters = 256,
kernel_size = (4,4),
strides = (2,2),
padding = 'SAME')(fcn2 + base_model.layers[14].output)
output = tf.keras.layers.Conv2DTranspose(filters = 2,
kernel_size = (16,16),
strides = (8,8),
padding = 'SAME',
activation = 'softmax')(fcn1 + base_model.layers[10].output)
model = tf.keras.Model(inputs = base_model.inputs, outputs = output)
model.summary()
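The skip connections above tap into specific VGG16 feature maps; a quick sketch (not part of the original cells) to confirm which layers the indices refer to, assuming the stock VGG16 with include_top = True:
for i in [-5, 14, 10]:
    layer = base_model.layers[i]
    print(i, layer.name, layer.output.shape)
# should print block5_pool (7x7x512), block4_pool (14x14x512), block3_pool (28x28x256)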
Training
model.compile(optimizer = 'adam',
loss = 'categorical_crossentropy',
metrics = ['accuracy'])
model.fit(seg_train_imgs, seg_train_labels, batch_size = 5, epochs = 5)
Testing
test_img = seg_test_imgs[[1]]
test_segmented = model.predict(test_img)
seg_mask = (test_segmented[:,:,:,1] > 0.5).reshape(224, 224, 1).astype(float)
plt.figure(figsize = (8,8))
plt.subplot(2,2,1)
plt.imshow(test_img[0])
plt.axis('off')
plt.subplot(2,2,2)
plt.imshow(seg_mask, cmap = 'Blues')
plt.axis('off')
plt.subplot(2,2,3)
plt.imshow(test_img[0])
plt.imshow(seg_mask, cmap = 'Blues', alpha = 0.5)
plt.axis('off')
plt.show()
from IPython.display import YouTubeVideo
YouTubeVideo('7h91Q94E7aw?si=_jEnWdl_Hw3hBx90&start=511', width = "560", height = "315")
Image restoration tries to recover the original image from a degraded one using prior knowledge of the degradation process.
The sources of corruption in digital images arise during image acquisition (digitization) and transmission.
The reconstruction is the inverse of the acquisition.
Inverse problems involve modeling of degradation and applying the inverse process in order to recover the original image from inadequate observations.
The observations contain incomplete information about the target parameter or data due to physical limitations of the measurement devices.
Consequently, solutions to inverse problems are non-unique.
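As a concrete example of a forward degradation model, downsampling a high-resolution image produces a low-resolution observation. The sketch below (the averaging kernel and factor are illustrative assumptions) shows why the inverse is non-unique: many high-resolution images map to the same low-resolution one.
import numpy as np

def degrade(hr, factor = 2):
    # Forward model: average-pool by `factor` to simulate a low-resolution observation
    h, w = hr.shape
    h, w = h - h % factor, w - w % factor
    return hr[:h, :w].reshape(h // factor, factor, w // factor, factor).mean(axis = (1, 3))

hr = np.random.rand(224, 224)
lr = degrade(hr)   # 112x112: information is lost, so recovering hr from lr has no unique solution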
HR and LR Images
Download data from here
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from google.colab import drive
drive.mount('/content/drive')
train_lr = np.load('/content/drive/MyDrive/DL/DL_data/SR_train_lr.npy')
train_hr = np.load('/content/drive/MyDrive/DL/DL_data/SR_train_hr.npy')
test_lr = np.load('/content/drive/MyDrive/DL/DL_data/SR_test_lr.npy')
n_train = train_lr.shape[0]
n_test = test_lr.shape[0]
print ("The number of training LR images : {}, shape : {}".format(n_train, train_lr.shape))
print ("The number of training HR images : {}, shape : {}".format(n_train, train_hr.shape))
print ("The number of testing LR images : {}, shape : {}".format(n_test, test_lr.shape))
idx = np.random.randint(n_train)
plt.figure(figsize = (8, 6))
plt.subplot(1,2,1)
plt.imshow(train_lr[idx][:,:,0], 'gray')
plt.title('Low-resolution image')
plt.axis('off')
plt.subplot(1,2,2)
plt.imshow(train_hr[idx][:,:,0], 'gray')
plt.title('High-resolution image')
plt.axis('off')
plt.show()
Build an FCN Model
inputs = tf.keras.Input(shape = (112, 112, 1))
# 3x3 convolutional layer
x = tf.keras.layers.Conv2D(filters = 16,
kernel_size = (3,3),
padding = 'SAME',
activation = 'relu')(inputs)
# first residual block
x_skip = x
x = tf.keras.layers.Conv2D(filters = 16,
kernel_size = (3,3),
padding = 'SAME',
activation = 'relu')(x)
x = tf.keras.layers.Conv2D(filters = 16,
kernel_size = (3,3),
padding = 'SAME',
activation = 'relu')(x)
x = tf.keras.layers.Add()([x_skip, x])
# second residual block
x_skip = x
x = tf.keras.layers.Conv2D(filters = 16,
kernel_size = (3,3),
padding = 'SAME',
activation = 'relu')(x)
x = tf.keras.layers.Conv2D(filters = 16,
kernel_size = (3,3),
padding = 'SAME',
activation = 'relu')(x)
x = tf.keras.layers.Add()([x_skip, x])
# third residual block
x_skip = x
x = tf.keras.layers.Conv2D(filters = 16,
kernel_size = (3,3),
padding = 'SAME',
activation = 'relu')(x)
x = tf.keras.layers.Conv2D(filters = 16,
kernel_size = (3,3),
padding = 'SAME',
activation = 'relu')(x)
x = tf.keras.layers.Add()([x_skip, x])
# upsampling layer
x = tf.keras.layers.Conv2DTranspose(filters = 16,
kernel_size = (4,4),
strides = (2,2),
padding = 'SAME',
activation = 'relu')(x)
# 3x3 convolutional layer
outputs = tf.keras.layers.Conv2D(filters = 1,
kernel_size = (3,3),
padding = 'SAME',
activation = 'sigmoid')(x)
model = tf.keras.Model(inputs, outputs)
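The three residual blocks above are identical; if preferred, the repetition can be factored into a small helper function (a sketch equivalent to the blocks above, not part of the original cells):
def residual_block(x, filters = 16):
    # Two 3x3 convolutions followed by an identity skip connection
    x_skip = x
    x = tf.keras.layers.Conv2D(filters, (3,3), padding = 'SAME', activation = 'relu')(x)
    x = tf.keras.layers.Conv2D(filters, (3,3), padding = 'SAME', activation = 'relu')(x)
    return tf.keras.layers.Add()([x_skip, x])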
Training
model.compile(optimizer = 'adam',
loss = 'mean_absolute_error',
metrics = ['mean_squared_error'])
model.fit(train_lr, train_hr, batch_size = 16, epochs = 30)
Testing
test_x = test_lr[[3]]
test_sr = model.predict(test_x)
plt.figure(figsize = (8, 6))
plt.subplot(1,2,1)
plt.imshow(test_x[0][:,:,0], 'gray')
plt.title('Low-resolution image')
plt.axis('off')
plt.subplot(1,2,2)
plt.imshow(test_sr[0][:,:,0], 'gray')
plt.title('Super-resolved image')
plt.axis('off')
plt.show()
train_blur = np.load('/content/drive/MyDrive/DL_Colab/DL_data/deblurring_train_blur.npy')
train_deblur = np.load('/content/drive/MyDrive/DL_Colab/DL_data/deblurring_train_deblur.npy')
test_blur = np.load('/content/drive/MyDrive/DL_Colab/DL_data/deblurring_test_blur.npy')
n_train = train_blur.shape[0]
n_test = test_blur.shape[0]
print ("The number of training blur images : {}, shape : {}".format(n_train, train_blur.shape))
print ("The number of training deblur images : {}, shape : {}".format(n_train, train_deblur.shape))
print ("The number of testing blur images : {}, shape : {}".format(n_test, test_blur.shape))
idx = np.random.randint(n_train)
plt.figure(figsize = (8, 6))
plt.subplot(1,2,1)
plt.imshow(train_blur[idx][:,:,0], 'gray')
plt.title('Blurred image')
plt.axis('off')
plt.subplot(1,2,2)
plt.imshow(train_deblur[idx][:,:,0], 'gray')
plt.title('Deblurred image')
plt.axis('off')
plt.show()
Build an FCN Model
inputs = tf.keras.Input(shape = (224, 224, 1))
# 3x3 convolutional layer
x = tf.keras.layers.Conv2D(filters = 16,
kernel_size = (3,3),
padding = 'SAME',
activation = 'relu')(inputs)
# first residual block
x_skip = x
x = tf.keras.layers.Conv2D(filters = 16,
kernel_size = (3,3),
padding = 'SAME',
activation = 'relu')(x)
x = tf.keras.layers.Conv2D(filters = 16,
kernel_size = (3,3),
padding = 'SAME',
activation = 'relu')(x)
x = tf.keras.layers.Add()([x_skip, x])
# second residual block
x_skip = x
x = tf.keras.layers.Conv2D(filters = 16,
kernel_size = (3,3),
padding = 'SAME',
activation = 'relu')(x)
x = tf.keras.layers.Conv2D(filters = 16,
kernel_size = (3,3),
padding = 'SAME',
activation = 'relu')(x)
x = tf.keras.layers.Add()([x_skip, x])
# third residual block
x_skip = x
x = tf.keras.layers.Conv2D(filters = 16,
kernel_size = (3,3),
padding = 'SAME',
activation = 'relu')(x)
x = tf.keras.layers.Conv2D(filters = 16,
kernel_size = (3,3),
padding = 'SAME',
activation = 'relu')(x)
x = tf.keras.layers.Add()([x_skip, x])
# 3x3 convolutional layer
outputs = tf.keras.layers.Conv2D(filters = 1,
kernel_size = (3,3),
padding = 'SAME',
activation = 'sigmoid')(x)
model = tf.keras.Model(inputs, outputs)
Training
model.compile(optimizer = 'adam',
loss ='mean_absolute_error',
metrics = ['mean_squared_error'])
model.fit(train_blur, train_deblur, batch_size = 16, epochs = 30)
Testing
test_x = test_blur[[1]]
test_deblur = model.predict(test_x)
plt.figure(figsize = (8, 6))
plt.subplot(1,2,1)
plt.imshow(test_x[0][:,:,0], 'gray')
plt.title('Blurred image')
plt.axis('off')
plt.subplot(1,2,2)
plt.imshow(test_deblur[0][:,:,0], 'gray')
plt.title('Deblurred image')
plt.axis('off')
plt.show()
%%javascript
$.getScript('https://kmahelona.github.io/ipython_notebook_goodies/ipython_notebook_toc.js')