Graph Neural Networks (GNN)


By Prof. Seungchul Lee
http://iai.postech.ac.kr/
Industrial AI Lab at POSTECH

Table of Contents

1. Graph



  • Graphs capture abstract relations, topology, or connectivity
  • Graphs $G(V,E)$
    • $V$: a set of vertices (nodes)
    • $E$: a set of edges (links, relations)
    • weight (edge property)
      • distance in a road network
      • strength of connection in a personal network
  • Graphs model any situation where you have objects and pairwise relations (symmetric or asymmetric) between the objects, for example:
Vertex       Edge relation                            Directed?
People       like each other                          undirected
People       is the boss of                           directed
Tasks        cannot be processed at the same time     undirected
Computers    have a direct network connection         undirected
Airports     planes fly between them                  directed
Cities       one can travel between them              directed

1.1. Types of Graphs

Undirected Graph vs. Directed Graph

  • Undirected graph
    • Edges of an undirected graph point both ways between nodes
    • ex) Two-way road
  • Directed graph
    • A graph in which the edges are directed
    • ex) One-way road

Weighted Graph

  • A graph whose edges are assigned costs or weights
  • Also called a 'network'
    • ex) connections between cities, lengths of roads, circuit element capacities, communication network usage fees, etc. (see the short sketch below)
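A minimal networkx sketch of a weighted graph; the weight edge attribute is the standard networkx convention, while the city names and distances here are purely illustrative.

import networkx as nx

WG = nx.Graph()
WG.add_edge('Seoul', 'Busan', weight = 325)    # illustrative distance (km)
WG.add_edge('Seoul', 'Daejeon', weight = 140)  # illustrative distance (km)

print(WG.edges(data = True))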

1.2. Graph Representation

Graph and Adjacency Matrix

  • A simple undirected graph consists of only nodes and edges
  • A graph can be represented as an adjacency matrix $A$
    • The adjacency matrix $A$ indicates the adjacent nodes of each node
  • A (number of nodes) $\times$ (number of nodes) matrix is needed to represent the adjacency matrix of an undirected graph
    • Symmetric matrix



1.3. Adjacency Matrix

  • Undirected graph $G = (V,E)$



$$ \begin{align*}V &= \{1,2,\cdots,7\} \\ E &= \{\{1,2\},\{1,6\},\{2,3\},\{3,4\},\{3,6\},\{3,7\},\{4,7\},\{5,6\} \} \end{align*} $$


$$\text{Adjacency list} = \begin{cases} \;\; \text{adj}(1) = \{2,6\}\\ \;\; \text{adj}(2) = \{1,3\}\\ \;\; \text{adj}(3) = \{2,4,6,7\}\\ \;\; \text{adj}(4) = \{3,7\}\\ \;\; \text{adj}(5) = \{6\}\\ \;\; \text{adj}(6) = \{1,3,5\}\\ \;\; \text{adj}(7) = \{3,4\} \end{cases}$$


$$ \text{Adjacency matrix (symmetric) } A = \begin{bmatrix} 0&1&0&0&0&1&0\\ 1&0&1&0&0&0&0\\ 0&1&0&1&0&1&1\\ 0&0&1&0&0&0&1\\ 0&0&0&0&0&1&0\\ 1&0&1&0&1&0&0\\ 0&0&1&1&0&0&0\\ \end{bmatrix}$$
  • Directed graph $G = (V,E)$



$$ \begin{align*} V &= \{1,2,\cdots,7\} \\ E &= \{(1,2),(1,6),(2,3),(3,4),(3,7),(4,7),(6,3),(6,5) \} \end{align*} $$


$$\text{Adjacency list} = \begin{cases} \;\; \text{adj}(1) &= \{2,6\}\\ \;\; \text{adj}(2) &= \{3\}\\ \;\; \text{adj}(3) &= \{4,7\}\\ \;\; \text{adj}(4) &= \{7\}\\ \;\; \text{adj}(5) &= \emptyset\\ \;\; \text{adj}(6) &= \{3,5\}\\ \;\; \text{adj}(7) &= \emptyset \end{cases}$$


$$ \text{Adjacency matrix (asymmetric) } A = \begin{bmatrix} 0&1&0&0&0&1&0\\ 0&0&1&0&0&0&0\\ 0&0&0&1&0&0&1\\ 0&0&0&0&0&0&1\\ 0&0&0&0&0&0&0\\ 0&0&1&0&1&0&0\\ 0&0&0&0&0&0&0\\ \end{bmatrix}$$
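As a quick check, a minimal sketch with networkx (building an nx.DiGraph from the edge list above and fixing the node order) reproduces this matrix:

import networkx as nx

DG = nx.DiGraph()
DG.add_edges_from([(1, 2), (1, 6), (2, 3), (3, 4), (3, 7), (4, 7), (6, 3), (6, 5)])

# fix the node order so that rows/columns follow 1, ..., 7
print(nx.adjacency_matrix(DG, nodelist = [1, 2, 3, 4, 5, 6, 7]).todense())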
In [1]:
# !pip install networkx
In [2]:
import networkx as nx
import matplotlib.pyplot as plt

%matplotlib inline
Graph.add_edge
In [3]:
g = nx.Graph()
g.add_edge('a', 'b')
g.add_edge('b', 'c')
g.add_edge('a', 'c')
g.add_edge('c', 'd')
In [4]:
# draw a graph with nodes and edges

nx.draw(g)
plt.show()
In [5]:
# draw a graph with node labels 

pos = nx.spring_layout(g)

nx.draw(g, pos, node_size = 500)
nx.draw_networkx_labels(g, pos, font_size = 10)
plt.show()
Graph.add_nodes_from
Graph.add_edges_from
In [6]:
G = nx.Graph()

G.add_nodes_from([1, 2, 3, 4])
G.add_edges_from([(1,2), (1,3), (2,3), (3,4)])  

# plot a graph 
pos = nx.spring_layout(G)

nx.draw(G, pos, node_size = 500)
nx.draw_networkx_labels(G, pos, font_size = 10)
plt.show()
In [7]:
print(nx.number_of_nodes(G))
print(nx.number_of_edges(G))
print(G.nodes())
print(G.edges())
4
4
[1, 2, 3, 4]
[(1, 2), (1, 3), (2, 3), (3, 4)]
In [8]:
A = nx.adjacency_matrix(G)

print(A)
print(A.todense())
  (0, 1)	1
  (0, 2)	1
  (1, 0)	1
  (1, 2)	1
  (2, 0)	1
  (2, 1)	1
  (2, 3)	1
  (3, 2)	1
[[0 1 1 0]
 [1 0 1 0]
 [1 1 0 1]
 [0 0 1 0]]

1.4. Degree

Degree of Undirected Graph

  • The degree of a vertex in a graph is the number of edges connected to it
  • Denote the degree of vertex $i$ by $d_{i}$
  • For an undirected graph of $n$ vertices


$$ d_i = \sum_{j=1}^{n} \; A_{ij} $$

  • Degree matrix $D$ of the adjacency matrix $A$


$$D = \text{diag}\{d_1, d_2, \cdots \}$$

  • Example





$$A = \begin{bmatrix} 0 & 1 & 1 & 1 \\ 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 \\ 1 & 0 & 1 & 0 \end{bmatrix} \qquad \Rightarrow \qquad D = \begin{bmatrix} 3 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 2 \end{bmatrix} $$
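A minimal numpy sketch of this computation, assuming the example $A$ above:

import numpy as np

A = np.array([[0, 1, 1, 1],
              [1, 0, 0, 0],
              [1, 0, 0, 1],
              [1, 0, 1, 0]])

d = A.sum(axis = 1)  # d_i = sum_j A_ij
D = np.diag(d)
print(D)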

1.5. Self-connecting Edges





$$A = \begin{bmatrix} 0 & 1 & 1 & 1 \\ 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 \\ 1 & 0 & 1 & 0 \end{bmatrix} \qquad \Rightarrow \qquad A+I = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 \\ 1 & 0 & 1 & 1 \end{bmatrix} \qquad \Rightarrow \qquad \tilde D = \begin{bmatrix} 4 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & 3 \end{bmatrix} $$

1.6. Neighborhood Normalization

Some nodes have many edges, while others have few

  • Adding $I$ adds self-connecting edges

  • Neighboring nodes are aggregated with normalized weights

  • This prevents numerical instabilities and vanishing/exploding gradients so that the model can converge

1) (First attempt) Normalized $\tilde A$

$$\tilde A = \tilde D^{-1}(A+I)$$

2) Symmetrically Normalized $\tilde A$

$$\tilde A = \tilde D^{-1/2}(A+I) \tilde D^{-1/2}$$





In [9]:
import numpy as np
In [10]:
A = np.array([[0,1,1,1],
              [1,0,0,0],
              [1,0,0,1],
              [1,0,1,0]])

A_self = A + np.eye(4)

print(A_self)
[[1. 1. 1. 1.]
 [1. 1. 0. 0.]
 [1. 0. 1. 1.]
 [1. 0. 1. 1.]]
In [11]:
D = np.array(A_self.sum(1)).flatten()
D = np.diag(D)

print(D)
[[4. 0. 0. 0.]
 [0. 2. 0. 0.]
 [0. 0. 3. 0.]
 [0. 0. 0. 3.]]



1) (First attempt) Normalized $\tilde A$

$$\tilde A = \tilde D^{-1}(A+I)$$
  • It is not symmetric.
In [12]:
A_norm = np.linalg.inv(D).dot(A_self)

print(A_norm)
[[0.25       0.25       0.25       0.25      ]
 [0.5        0.5        0.         0.        ]
 [0.33333333 0.         0.33333333 0.33333333]
 [0.33333333 0.         0.33333333 0.33333333]]
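A quick numpy check confirms the asymmetry:

print(np.allclose(A_norm, A_norm.T))  # False: row normalization breaks symmetry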



2) Symmetrically Normalized $\tilde A$

$$\tilde A = \tilde D^{-1/2}(A+I) \tilde D^{-1/2}$$
  • Now it is symmetric.

  • (Skip the details)

In [13]:
from scipy.linalg import fractional_matrix_power

D_half_norm = fractional_matrix_power(D, -0.5)

print(D_half_norm)
[[0.5        0.         0.         0.        ]
 [0.         0.70710678 0.         0.        ]
 [0.         0.         0.57735027 0.        ]
 [0.         0.         0.         0.57735027]]
In [14]:
A_self = np.asmatrix(A_self)
D_half_norm = np.asmatrix(D_half_norm)

A_half_norm = D_half_norm*A_self*D_half_norm

print(A_half_norm)
[[0.25       0.35355339 0.28867513 0.28867513]
 [0.35355339 0.5        0.         0.        ]
 [0.28867513 0.         0.33333333 0.33333333]
 [0.28867513 0.         0.33333333 0.33333333]]
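And a quick numpy check confirms that the symmetric normalization indeed yields a symmetric matrix:

print(np.allclose(A_half_norm, A_half_norm.T))  # True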

2. Graph Convolution Network (GCN)

2.1. Convolution

  • In the previous CNN lecture, we saw that a CNN has two key characteristics: preserving the spatial structure and weight sharing
  • To apply convolution to a graph network, the graph model has to provide these characteristics as well

Convolution Layer

  • In a CNN, the convolution layer preserves the spatial structure of the input
  • It convolves over all spatial locations
    • Extracting features at each convolution layer



Weight Sharing

  • Weight sharing reduces the number of parameters
  • Within the same layer, the same filter is used throughout the image



2.2. Connection between CNN and GCN

  • GCNs perform similar operations: the model learns features by inspecting neighboring nodes
  • The major difference between CNNs and GCNs is that CNNs are built to operate on regular (Euclidean) structured data, while GCNs operate on graph data, where the number of node connections varies and the nodes are unordered (irregular, non-Euclidean structured data)




2.3. Basics of GCN

  • Similar to a CNN, a GCN updates each node using its adjacent nodes
  • Unlike a CNN, each node of a GCN has a different number of adjacent nodes
    • The adjacent nodes of each node are indicated by the adjacency matrix $A$
  • Basic process (and terminology) of a GCN
    • Message: information passed by neighboring nodes to the central node
    • Aggregate: collect information from neighboring nodes
    • Update: update the embedding by combining information from neighboring nodes and from the node itself






$$ \begin{align*} h_{u}^{(k+1)} &= \text{UPDATE} \left( \text{AGGREGATE} \left( \left\{ h_{v}^{(k)}, \forall v \in \mathcal{N}(u) \right\} \right) \right)\\ \end{align*} $$
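A per-node view of one message-passing step, as a minimal sketch with sum aggregation and an identity update (the small adjacency list and scalar features are hypothetical):

adj = {1: [2, 3], 2: [1], 3: [1]}  # hypothetical adjacency list
h = {1: 1.0, 2: -1.0, 3: 0.5}      # hypothetical scalar node features

# AGGREGATE = sum over neighbors, UPDATE = identity
h_new = {u: sum(h[v] for v in adj[u]) for u in adj}
print(h_new)                       # {1: -0.5, 2: 1.0, 3: 1.0}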



1) Message Aggregation from Local Neighborhood


$$ \begin{align*} &\text{AGGREGATE} \left( \left\{ h_{v}^{(k)}, \forall v \in \mathcal{N}(u) \right\} \right)\\\\ &\Rightarrow AH^{(k)} \end{align*} $$





2) Update


Adding a non-linear function: $k^{\text{th}}$ layer

$$ \begin{align*} H^{(k+1)} &= f \left( A, H^{(k)} \right) \\ & = \sigma \left( A H^{(k)} \, W \right) \end{align*} $$



$$ \begin{align*} h_{u}^{(k+1)} &= \text{UPDATE} \left( \text{AGGREGATE} \left( \left\{ h_{v}^{(k)}, \forall v \in \mathcal{N}(u) \right\} \right) \right)\\\\ H^{(k+1)} &= \sigma \left(A H^{(k)} \, W_{\text{neigh}}^{(k)} \right) \end{align*} $$


  • $h_1^{(k)}$: feature vector of the first node in the $k^{\text{th}}$ layer
  • $W^{(k)}$: weight of the $k^{\text{th}}$ layer
    • Weight sharing: the same weight is shared within each layer
      • Within the same layer, each node is updated in the same way, so the weight can be shared
      • Weight sharing reduces computational complexity and time (see the short sketch below)
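One layer of this update as a minimal numpy sketch, with ReLU standing in for $\sigma$; the feature and weight values are hypothetical:

import numpy as np

A = np.array([[0, 1, 1, 1],
              [1, 0, 0, 0],
              [1, 0, 0, 1],
              [1, 0, 1, 0]])              # example adjacency matrix from above

H = np.array([[1.], [0.], [0.], [-1.]])   # hypothetical node features (4 nodes, 1 feature)
W = np.array([[0.5, -0.5]])               # hypothetical shared weights (1 -> 2 features)

H_next = np.maximum(0, A @ H @ W)         # sigma(A H W) with ReLU
print(H_next)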





2.4. Further Improvements for GCN

1) Message Passing with Self-Loops

  • As a simplification of the neural message passing approach, it is common to add self-loops to the input graph and omit the explicit update step


$$ \begin{align*} h_{u}^{(k+1)} &= \text{UPDATE} \left( h_{u}^{(k)}, \text{AGGREGATE} \left( \left\{ h_{v}^{(k)}, \forall v \in \mathcal{N}(u) \right\} \right) \right) \\ &= \text{UPDATE} \left( \text{AGGREGATE} \left( \left\{ h_{v}^{(k)}, \forall v \in \mathcal{N}(u) \cup \{u \}\right\} \right) \right) \\ \\ H^{(k+1)} &= \sigma \left( \left(A+I \right)H^{(k)} \, W^{(k)}\right) \end{align*} $$






2) Neighborhood Normalization

  • The most basic neighborhood aggregation operation simply takes the sum of the neighbor embeddings.
  • One issue with this approach is that it can be unstable and highly sensitive to node degrees.
  • One solution to this problem is to simply normalize the aggregation operation based upon the degrees of the nodes involved.
  • The simplest approach is to just take a weighted average rather than sum.


$$ \begin{align*} \tilde A &= D^{-1/2}AD^{-1/2} + I \\ & \approx \tilde D^{-1/2}(A+I) \tilde D^{-1/2} \qquad \text{where } \, \tilde D \, \text{ is the degree matrix of } A+I \end{align*} $$






Finally, Graph Convolutional Networks


$$ \begin{align*} H^{(k+1)} &= \sigma \left(A H^{(k)} \, W^{(k)} \right) \\\\ &\Downarrow \\\\ H^{(k+1)} &= \sigma \left( \left(A+I \right)H^{(k)} \, W^{(k)}\right) \\\\ &\Downarrow \\\\ H^{(k+1)} &= \sigma \left( \left(\tilde D^{-1/2}(A+I)\tilde D^{-1/2} \right)H^{(k)} \, W^{(k)}\right)\\\\\\ \therefore H^{(k+1)} &= \sigma \left( \tilde A H^{(k)} \, W^{(k)}\right) \end{align*} $$


  • For each layer, the normalized adjacency matrix, the feature matrix, and the weight matrix are multiplied to create the next feature matrix




In [15]:
import networkx as nx
import matplotlib.pyplot as plt

%matplotlib inline

G = nx.Graph()

G.add_nodes_from([1, 2, 3, 4, 5, 6])
G.add_edges_from([(1, 2), (1, 3), (2, 3), (1, 4), (4, 5), (4, 6), (5, 6)])

nx.draw(G, with_labels = True, node_size = 600, font_size = 22)
plt.show()
In [16]:
A = nx.adjacency_matrix(G).todense()

print(A)
[[0 1 1 1 0 0]
 [1 0 1 0 0 0]
 [1 1 0 0 0 0]
 [1 0 0 0 1 1]
 [0 0 0 1 0 1]
 [0 0 0 1 1 0]]

Assign a feature vector $H$ so that the nodes can be separated into two groups

In [17]:
H = np.matrix([1,0,0,-1,0,0]).T

print(H)
[[ 1]
 [ 0]
 [ 0]
 [-1]
 [ 0]
 [ 0]]

The product of the adjacency matrix and the node feature matrix gives, for each node, the sum of its neighboring node features

In [18]:
A*H
Out[18]:
matrix([[-1],
        [ 1],
        [ 1],
        [ 1],
        [-1],
        [-1]])
In [19]:
A_self = A + np.eye(6)

A_self*H
Out[19]:
matrix([[ 0.],
        [ 1.],
        [ 1.],
        [ 0.],
        [-1.],
        [-1.]])

Similar to data pre-processing for any neural network, normalize the features to prevent numerical instabilities and vanishing/exploding gradients so that the model can converge

In [20]:
D = np.array(A_self.sum(1)).flatten()
D = np.diag(D)

D_half_norm = fractional_matrix_power(D, -0.5)

A_self = np.asmatrix(A_self)
D_half_norm = np.asmatrix(D_half_norm)

A_half_norm = D_half_norm*A_self*D_half_norm

A_half_norm*H
Out[20]:
matrix([[ 0.        ],
        [ 0.28867513],
        [ 0.28867513],
        [ 0.        ],
        [-0.28867513],
        [-0.28867513]])

Build a 2-layer GCN using ReLU as the activation function

$$ \begin{align*} H^{(2)} &= \text{ReLU} \left( \tilde A H^{(1)} \, W^{(1)}\right) \\ H^{(3)} &= \text{ReLU} \left( \tilde A H^{(2)} \, W^{(2)}\right) \end{align*} $$
In [21]:
np.random.seed(20)

W1 = np.random.randn(1, 4) # input: 1 -> hidden: 4
W2 = np.random.randn(4, 2) # hidden: 4 -> output: 2

def relu(x):
    return np.maximum(0, x)

def gcn(A_self, H, W):
    # symmetric normalization D^{-1/2} (A+I) D^{-1/2}, then sigma(A_tilde H W)
    D = np.diag(np.array(A_self.sum(1)).flatten())
    D_half_norm = fractional_matrix_power(D, -0.5)
    H_new = D_half_norm*A_self*D_half_norm*H*W
    return relu(H_new)

H1 = H
H2 = gcn(A_self, H1, W1)
H3 = gcn(A_self, H2, W2)

print(H3)
[[0.         0.07472825]
 [0.         0.08628875]
 [0.         0.08628875]
 [0.12632564 0.        ]
 [0.14586829 0.        ]
 [0.14586829 0.        ]]

2.5. Readout: Permutation Invariance

  • The adjacency matrix can differ even though two graphs have the same network structure

    • Even if the edge information between all nodes is the same, the order of values in the matrix may differ with the node ordering (e.g., due to rotation and symmetry)
  • Therefore, for a graph-level representation, the readout layer makes the result permutation invariant by applying an MLP to each node and summing





  • Node-wise summation


$$ Z_G = \sigma \left(\sum_{i \in G} \text{MLP} \left(H_i^{(L)} \right) \right) $$
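A minimal numpy sketch of this readout, assuming the $6 \times 2$ node embedding H3 from the 2-layer GCN above; the one-layer MLP weight is hypothetical (a real readout MLP would usually be deeper):

W_out = np.random.randn(2, 2)                       # hypothetical MLP weight
Z = 1 / (1 + np.exp(-(H3 * W_out).sum(axis = 0)))   # node-wise MLP, sum over nodes, sigmoid
print(Z)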




2.6. Overall Structure of GCN





  • Graph information, i.e., the feature matrix and the adjacency matrix, is input to the GCN
  • Graph Convolution Layer
    • Updates the information of each node according to the adjacency matrix





  • The readout layer collects all node information with an MLP and produces a value for regression or classification

2.7. Three Types of GNN Problem

  • Task 1: Node classification

  • Task 2: Edge prediction

  • Task 3: Graph classification





3. Lab 1: Node Classification using Graph Convolutional Networks

3.0. List of GNN Python Libraries

  • Deep Graph Library (DGL)
    • Based on PyTorch, TensorFlow or Apache MXNet.
  • Graph Nets
    • DeepMind’s library for building graph networks in TensorFlow and Sonnet
  • Spektral
    • Based on the Keras API and TensorFlow 2
    • We will use this one for demo
In [22]:
# !pip install spektral==0.6.0
# !pip install tensorflow==2.2.0
# !pip install keras==2.3.0
In [23]:
import numpy as np
import networkx as nx
import tensorflow as tf
import matplotlib.pyplot as plt

import spektral

3.1. Data Loading

Download data from here

CORA dataset

  • This dataset is the MNIST equivalent in graph learning

  • The CORA dataset consists of 2708 scientific publications classified into one of seven classes.

    • Case_Based: 298
    • Genetic_Algorithms: 418
    • Neural_Networks: 818
    • Probabilistic_Methods: 426
    • Reinforcement_Learning: 217
    • Rule_Learning: 180
    • Theory: 351
  • The citation network consists of 5429 links.

  • Each publication in the dataset is described by a 0/1-valued word vector indicating the absence/presence of the corresponding word from the dictionary.

  • The dictionary consists of 1433 unique words.

In [24]:
nodes = np.load('./data_files/cora_nodes.npy')
edge_list = np.load('./data_files/cora_edges.npy')

labels_encoded = np.load('./data_files/cora_labels_encoded.npy')

H = np.load('./data_files/cora_features.npy')
data_mask = np.load('./data_files/cora_mask.npy')

N = H.shape[0]
F = H.shape[1]

print('H shape: ', H.shape)
print('The number of nodes (N): ', N)
print('The number of features (F) of each node: ', F)

num_classes = 7
print('The number of classes: ', num_classes)
H shape:  (2708, 1433)
The number of nodes (N):  2708
The number of features (F) of each node:  1433
The number of classes:  7





3.2 Train/Test Data Splitting

We split the data into training and test sets at a ratio of 7:3, giving 1895 nodes for training and 813 for testing.

In [25]:
# index of node for train model
train_mask = data_mask[0]

# index of node for test model
test_mask = data_mask[1]
In [26]:
print("The number of trainig data: ", np.sum(train_mask))
print("The number of test data: ", np.sum(test_mask))
The number of trainig data:  1895
The number of test data:  813

3.3 Initializing Graph G

In [27]:
G = nx.Graph(name = 'Cora')
G.add_nodes_from(nodes)
G.add_edges_from(edge_list)

print('Graph info: ', nx.info(G))
Graph info:  Graph named 'Cora' with 2708 nodes and 5278 edges

3.4 Construct and Normalize Adjacency Matrix A

3.4.1. Normalizing Term $\tilde D^{-1/2} (A+I) \tilde D^{-1/2}$

In [28]:
from scipy.linalg import fractional_matrix_power

A = nx.adjacency_matrix(G)

I = np.eye(A.shape[-1])
A_self = A + I

D = np.diag(np.array(A_self.sum(1)).flatten())
D_half_norm = fractional_matrix_power(D, -0.5)
    
A_half_norm = D_half_norm * A_self * D_half_norm

A_half_norm = np.array(A_half_norm)
H = np.array(H)

3.5 GCN Model

In [29]:
H_in = tf.keras.layers.Input(shape = (F, ))
A_in = tf.keras.layers.Input(shape = (N, ))

graph_conv_1 = spektral.layers.GraphConv(channels = 16,
                                         activation = 'relu')([H_in, A_in])

graph_conv_2 = spektral.layers.GraphConv(channels = 7,
                                         activation = 'softmax')([graph_conv_1, A_in])

model = tf.keras.models.Model(inputs = [H_in, A_in], outputs = graph_conv_2)

model.compile(optimizer = tf.keras.optimizers.Adam(learning_rate = 1e-2),
              loss = 'categorical_crossentropy',
              weighted_metrics = ['acc'])

model.summary()
Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            [(None, 1433)]       0                                            
__________________________________________________________________________________________________
input_2 (InputLayer)            [(None, 2708)]       0                                            
__________________________________________________________________________________________________
graph_conv (GraphConv)          (None, 16)           22944       input_1[0][0]                    
                                                                 input_2[0][0]                    
__________________________________________________________________________________________________
graph_conv_1 (GraphConv)        (None, 7)            119         graph_conv[0][0]                 
                                                                 input_2[0][0]                    
==================================================================================================
Total params: 23,063
Trainable params: 23,063
Non-trainable params: 0
__________________________________________________________________________________________________

3.6 Train Model

In [30]:
model.fit([H, A_half_norm],
          labels_encoded,
          sample_weight = train_mask,
          epochs = 30,
          batch_size = N,
          shuffle = False)
Epoch 1/30
1/1 [==============================] - 0s 997us/step - loss: 1.3631 - acc: 0.1430
Epoch 2/30
1/1 [==============================] - 0s 994us/step - loss: 1.2854 - acc: 0.4153
Epoch 3/30
1/1 [==============================] - 0s 0s/step - loss: 1.1984 - acc: 0.5894
Epoch 4/30
1/1 [==============================] - 0s 998us/step - loss: 1.1052 - acc: 0.6570
Epoch 5/30
1/1 [==============================] - 0s 998us/step - loss: 1.0199 - acc: 0.6902
Epoch 6/30
1/1 [==============================] - 0s 1ms/step - loss: 0.9398 - acc: 0.7103
Epoch 7/30
1/1 [==============================] - 0s 998us/step - loss: 0.8625 - acc: 0.7193
Epoch 8/30
1/1 [==============================] - 0s 0s/step - loss: 0.7886 - acc: 0.7303
Epoch 9/30
1/1 [==============================] - 0s 998us/step - loss: 0.7194 - acc: 0.7499
Epoch 10/30
1/1 [==============================] - 0s 997us/step - loss: 0.6549 - acc: 0.7683
Epoch 11/30
1/1 [==============================] - 0s 997us/step - loss: 0.5948 - acc: 0.7963
Epoch 12/30
1/1 [==============================] - 0s 997us/step - loss: 0.5394 - acc: 0.8301
Epoch 13/30
1/1 [==============================] - 0s 998us/step - loss: 0.4892 - acc: 0.8586
Epoch 14/30
1/1 [==============================] - 0s 997us/step - loss: 0.4444 - acc: 0.8786
Epoch 15/30
1/1 [==============================] - 0s 987us/step - loss: 0.4049 - acc: 0.8860
Epoch 16/30
1/1 [==============================] - 0s 997us/step - loss: 0.3703 - acc: 0.8950
Epoch 17/30
1/1 [==============================] - 0s 996us/step - loss: 0.3398 - acc: 0.8992
Epoch 18/30
1/1 [==============================] - 0s 998us/step - loss: 0.3131 - acc: 0.9018
Epoch 19/30
1/1 [==============================] - 0s 0s/step - loss: 0.2898 - acc: 0.9055
Epoch 20/30
1/1 [==============================] - 0s 0s/step - loss: 0.2696 - acc: 0.9119
Epoch 21/30
1/1 [==============================] - 0s 0s/step - loss: 0.2521 - acc: 0.9161
Epoch 22/30
1/1 [==============================] - 0s 998us/step - loss: 0.2369 - acc: 0.9187
Epoch 23/30
1/1 [==============================] - 0s 0s/step - loss: 0.2235 - acc: 0.9193
Epoch 24/30
1/1 [==============================] - 0s 997us/step - loss: 0.2118 - acc: 0.9203
Epoch 25/30
1/1 [==============================] - 0s 997us/step - loss: 0.2015 - acc: 0.9208
Epoch 26/30
1/1 [==============================] - 0s 0s/step - loss: 0.1923 - acc: 0.9219
Epoch 27/30
1/1 [==============================] - 0s 998us/step - loss: 0.1841 - acc: 0.9214
Epoch 28/30
1/1 [==============================] - 0s 0s/step - loss: 0.1767 - acc: 0.9230
Epoch 29/30
1/1 [==============================] - 0s 0s/step - loss: 0.1701 - acc: 0.9240
Epoch 30/30
1/1 [==============================] - 0s 997us/step - loss: 0.1640 - acc: 0.9266
Out[30]:
<tensorflow.python.keras.callbacks.History at 0x18edc562708>

3.7 Model Evaluation

In [31]:
y_pred = model.evaluate([H, A_half_norm],
                        labels_encoded,
                        sample_weight = test_mask,
                        batch_size = N)
1/1 [==============================] - 0s 997us/step - loss: 0.0963 - acc: 0.8954
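To inspect per-node predictions (a sketch using the trained model above; predict returns class probabilities for every node):

probs = model.predict([H, A_half_norm], batch_size = N)
pred_classes = np.argmax(probs, axis = 1)
print(pred_classes[:10])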

3.8 t-SNE

In [32]:
from sklearn.manifold import TSNE

layer_outputs = [layer.output for layer in model.layers]
activation_model = tf.keras.models.Model(inputs = model.input, outputs = layer_outputs)
activations = activation_model.predict([H,A_half_norm],batch_size = N)

x_tsne = TSNE(n_components = 2).fit_transform(activations[2]) 
In [33]:
def plot_tSNE(labels_encoded,x_tsne):
    color_map = np.argmax(labels_encoded, axis = 1)
    plt.figure(figsize = (10,10))
    for cl in range(num_classes):
        indices = np.where(color_map == cl)
        indices = indices[0]
        plt.scatter(x_tsne[indices,0], x_tsne[indices, 1], label = cl)
    plt.legend()
    plt.show()
    
plot_tSNE(labels_encoded,x_tsne)

4. Useful Resources for Further Study

In [34]:
%%html 
<center><iframe src="https://www.youtube.com/embed/fOctJB4kVlM?rel=0" 
width="560" height="315" frameborder="0" allowfullscreen></iframe></center>
In [35]:
%%html 
<center><iframe src="https://www.youtube.com/embed/ABCGCf8cJOE?rel=0" 
width="560" height="315" frameborder="0" allowfullscreen></iframe></center>
In [36]:
%%html 
<center><iframe src="https://www.youtube.com/embed/0YLZXjMHA-8?rel=0" 
width="560" height="315" frameborder="0" allowfullscreen></iframe></center>
In [37]:
%%html 
<center><iframe src="https://www.youtube.com/embed/ex2qllcVneY?rel=0" 
width="560" height="315" frameborder="0" allowfullscreen></iframe></center>
In [38]:
%%html 
<center><iframe src="https://www.youtube.com/embed/YL1jGgcY78U?rel=0" 
width="560" height="315" frameborder="0" allowfullscreen></iframe></center>
In [39]:
%%html 
<center><iframe src="https://www.youtube.com/embed/8owQBFAHw7E?rel=0" 
width="560" height="315" frameborder="0" allowfullscreen></iframe></center>
In [40]:
%%html 
<center><iframe src="https://www.youtube.com/embed/R67-JxtOQzg?rel=0" 
width="560" height="315" frameborder="0" allowfullscreen></iframe></center>
In [41]:
%%javascript
$.getScript('https://kmahelona.github.io/ipython_notebook_goodies/ipython_notebook_toc.js')