Markov Chain


By Prof. Seungchul Lee
http://iailab.kaist.ac.kr/
Industrial AI Lab at KAIST

Table of Contents

1. Markov Process (or Chain)

1.1. Sequential Processes

  • Most classifiers ignore the sequential aspects of data

  • Consider a system which can occupy one of $N$ discrete states or categories $q_t \in \{S_1, S_2, \dots, S_N\}$

  • We are interested in stochastic systems, in which state evolution is random

  • Any joint distribution can be factored into a series of conditional distributions


$$p(q_0, q_1, \dots, q_T) = p(q_0)\, p(q_1 \mid q_0)\, p(q_2 \mid q_1, q_0)\, p(q_3 \mid q_2, q_1, q_0) \cdots$$

Almost impossible to compute!

$$p(q_0, q_1, \dots, q_T) = p(q_0)\, p(q_1 \mid q_0)\, p(q_2 \mid q_1)\, p(q_3 \mid q_2) \cdots$$

Possible and tractable!
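The tractable factorization can be evaluated directly: the joint probability of a short state sequence is the initial-state probability times a product of one-step transition probabilities. A minimal sketch (the transition matrix and the point-mass initial distribution here are illustrative choices, matching the 3-state example used later):

```python
import numpy as np

# illustrative 3-state chain (same P as the episode example below)
P = np.array([[0, 0, 1],
              [1/2, 1/2, 0],
              [1/3, 2/3, 0]])

pi0 = np.array([1, 0, 0])   # start in state 0 with probability 1

def joint_prob(seq, P, pi0):
    """p(q0, q1, ..., qT) under the Markov factorization."""
    prob = pi0[seq[0]]
    for s, s_next in zip(seq[:-1], seq[1:]):
        prob *= P[s, s_next]
    return prob

print(joint_prob([0, 2, 1, 1], P, pi0))   # 1 * 1 * (2/3) * (1/2) = 1/3
```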

1.2. Markov Process

  • (Assumption) for a Markov process, the next state depends only on the current state:

$$p(q_{t+1} \mid q_t, \dots, q_0) = p(q_{t+1} \mid q_t)$$


  • More clearly

$$p(q_{t+1} = s_j \mid q_t = s_i) = p(q_{t+1} = s_j \mid q_t = s_i,\ \text{any earlier history})$$


  • Given current state, the past does not matter
  • The state captures all relevant information from the history
  • The state is a sufficient statistic of the future
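The memoryless claim can be checked empirically: in a long simulated trajectory, the relative frequency of the next state given the current state should be about the same whether or not we also condition on the previous state. A rough sketch using the 3-state chain from the episode example below (the seed, trajectory length, and tolerance are arbitrary choices):

```python
import numpy as np

np.random.seed(0)
P = np.array([[0, 0, 1],
              [1/2, 1/2, 0],
              [1/3, 2/3, 0]])

# simulate a long trajectory starting from state 0
x, S = 0, [0]
for _ in range(200000):
    x = np.random.choice(3, p=P[x])
    S.append(x)
S = np.array(S)

# P(next = 1 | current = 1), ignoring earlier history
cur1 = S[:-1] == 1
p_uncond = np.mean(S[1:][cur1] == 1)

# P(next = 1 | current = 1, previous = 2): conditioning on more history
prev2_cur1 = (S[:-2] == 2) & (S[1:-1] == 1)
p_cond = np.mean(S[2:][prev2_cur1] == 1)

print(p_uncond, p_cond)   # both close to P[1,1] = 0.5
```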

1.3. State Transition Matrix

  • For a Markov state $s$ and successor state $s'$, the state transition probability is defined by

$$P_{ss'} = P\left[S_{t+1} = s' \mid S_t = s\right]$$


  • State transition matrix $P$ defines transition probabilities from all states $s$ to all successor states $s'$.


Example: Markov chain episodes

  • Sample episodes starting from $S_1$
In [1]:
import numpy as np

P = [[0, 0, 1],
    [1/2, 1/2, 0],
    [1/3, 2/3, 0]]
In [2]:
print(P[1][:])
[0.5, 0.5, 0]
In [3]:
a = np.random.choice(3,1,p = P[1][:])
print(a)
[0]
In [4]:
# sequential processes
# sequence generated by Markov chain
# S1 = 0, S2 = 1, S3 = 2

# starting from 0
x = 0
S = []
S.append(x)

for i in range(50):    
    x = np.random.choice(3,1,p = P[x][:])[0]    
    S.append(x)
     
print(S)    
[0, 2, 1, 1, 1, 0, 2, 1, 1, 0, 2, 1, 1, 1, 0, 2, 1, 0, 2, 1, 1, 1, 1, 1, 0, 2, 0, 2, 1, 0, 2, 0, 2, 1, 0, 2, 1, 0, 2, 0, 2, 1, 1, 0, 2, 0, 2, 1, 0, 2, 1]
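A longer simulated sequence lets us go the other way: counting how often each transition $i \to j$ occurs and normalizing each row of the count matrix should recover $P$ approximately. A sketch (the sequence length and seed are arbitrary):

```python
import numpy as np

np.random.seed(1)
P = np.array([[0, 0, 1],
              [1/2, 1/2, 0],
              [1/3, 2/3, 0]])

# simulate a long trajectory starting from state 0
x, S = 0, [0]
for _ in range(100000):
    x = np.random.choice(3, p=P[x])
    S.append(x)

# count transitions i -> j and normalize each row
counts = np.zeros((3, 3))
for s, s_next in zip(S[:-1], S[1:]):
    counts[s, s_next] += 1
P_hat = counts / counts.sum(axis=1, keepdims=True)

print(np.round(P_hat, 2))   # close to P
```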

1.4. Markov Chain Components

A Markov chain represents passive stochastic behavior and consists of:

1) a finite set of $N$ states, $S = \{S_1, \dots, S_N\}$

2) a state transition probability matrix, $P = \{p_{ij}\}_{N \times N},\ 1 \le i, j \le N$

3) an initial state probability distribution, $\pi = \{\pi_i\}$
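Putting the three components together, an episode is generated by first drawing the initial state from $\pi$ and then repeatedly drawing the next state from the current state's row of $P$. A minimal sketch (the specific $\pi$ here is an illustrative choice; the earlier examples implicitly used a point mass on $S_1$):

```python
import numpy as np

np.random.seed(2)

S_names = ['S1', 'S2', 'S3']                  # 1) finite state set
P = np.array([[0, 0, 1],                      # 2) transition probability matrix
              [1/2, 1/2, 0],
              [1/3, 2/3, 0]])
pi = np.array([0.5, 0.3, 0.2])                # 3) initial distribution (illustrative)

def sample_episode(T):
    x = np.random.choice(len(pi), p=pi)       # draw initial state from pi
    episode = [x]
    for _ in range(T):
        x = np.random.choice(len(P), p=P[x])  # draw next state from row P[x]
        episode.append(x)
    return episode

print([S_names[s] for s in sample_episode(5)])
```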


Example of Markov Chain

  • Starting from $S_1$ = Class 1
In [5]:
P = [[0, 0.5, 0, 0, 0, 0.5, 0],
    [0, 0, 0.8, 0, 0, 0, 0.2],
    [0, 0, 0, 0.6, 0.4, 0, 0],
    [0, 0, 0, 0, 0, 0, 1],
    [0.2, 0.4, 0.4, 0, 0, 0, 0],
    [0.1, 0, 0, 0, 0, 0.9, 0],
    [0, 0, 0, 0, 0, 0, 1]]

# sequential processes
# sequence generated by Markov chain
# [C1 C2 C3 Pass Pub FB Sleep] = [0 1 2 3 4 5 6]

name = ['C1','C2','C3','Pass','Pub','FB','Sleep']

# starting from 0
x = 0 
S = []
S.append(x)

for i in range(5):
    x = np.random.choice(len(P),1,p = P[x][:])[0]    
    S.append(x)
    
print(S)    
[0, 1, 2, 4, 1, 2]
In [6]:
episode = []

for i in S:
    episode.append(name[i])

print(episode)    
['C1', 'C2', 'C3', 'Pub', 'C2', 'C3']

2. Chapman-Kolmogorov Equation

  • (1-step transition probabilities) For a Markov chain on a finite state space $S = \{S_1, \dots, S_N\}$, with transition probability matrix $P$ and initial distribution $\pi^{(0)} = \{\pi^{(0)}_i\}$ (row vector), the distribution of $X(1)$ is given by

$$\pi^{(1)} = \pi^{(0)} P$$

  • (2-step transition probabilities) Likewise, the distribution of $X(2)$ is given by

$$\pi^{(2)} = \pi^{(1)} P = \pi^{(0)} P^2$$

  • (n-step transition probabilities) In general, the distribution of $X(n)$ is given by

$$\pi^{(n)} = \pi^{(0)} P^n$$
  • $P^n$: n-step transition probabilities

  • Key recursion

$$p_{ij}(n) = \sum_{k=1}^{N} p_{ik}(n-1)\, p_{kj}(1), \qquad i \to k \ \text{and} \ k \to j \ \text{imply} \ i \to j$$
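This recursion is exactly matrix multiplication: the n-step transition matrix is the (n−1)-step matrix times $P$, so the n-step probabilities are the entries of $P^n$. The explicit sum for one entry can be checked against the matrix power, using the 3-state $P$ from the earlier example:

```python
import numpy as np

P = np.array([[0, 0, 1],
              [1/2, 1/2, 0],
              [1/3, 2/3, 0]])

# p_ij(2) via the explicit sum over intermediate states k
i, j = 0, 1
p_ij_2 = sum(P[i, k] * P[k, j] for k in range(3))

# same entry read off the matrix power P^2
P2 = np.linalg.matrix_power(P, 2)
print(p_ij_2, P2[i, j])   # both 2/3: the only 2-step route 0 -> 1 is via state 2
```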


In [7]:
# state probability distribution

P = [[0, 0, 1],
    [1/2, 1/2, 0],
    [1/3, 2/3, 0]]

u = [0, 1, 0]

P = np.asmatrix(P)
u = np.asmatrix(u)

for i in range(10):
    u = u*P
    print(u) 
[[0.5 0.5 0. ]]
[[0.25 0.25 0.5 ]]
[[0.29166667 0.45833333 0.25      ]]
[[0.3125     0.39583333 0.29166667]]
[[0.29513889 0.39236111 0.3125    ]]
[[0.30034722 0.40451389 0.29513889]]
[[0.30063657 0.3990162  0.30034722]]
[[0.29962384 0.39973958 0.30063657]]
[[0.30008198 0.40029417 0.29962384]]
[[0.3000217  0.39989632 0.30008198]]
In [8]:
u = [0, 1, 0]
u = u*P**10
print(u)
[[0.3000217  0.39989632 0.30008198]]
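The Chapman-Kolmogorov relation itself, $P^{m+n} = P^m P^n$ in matrix form, is a one-line sanity check with matrix powers (the exponents 3 and 4 are arbitrary):

```python
import numpy as np

P = np.array([[0, 0, 1],
              [1/2, 1/2, 0],
              [1/3, 2/3, 0]])

lhs = np.linalg.matrix_power(P, 7)                                  # P^(3+4)
rhs = np.linalg.matrix_power(P, 3) @ np.linalg.matrix_power(P, 4)   # P^3 P^4
print(np.allclose(lhs, rhs))   # True
```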

3. Stationary Distribution

  • Steady-state behavior

  • Does $p_{ij}(n) = P[X_n = j \mid X_0 = i]$ converge to some $\pi_j$?

  • Take the limit as $n \to \infty$

$$p_{ij}(n) = \sum_{k=1}^{N} p_{ik}(n-1)\, p_{kj}$$

$$\pi_j = \sum_{k=1}^{N} \pi_k\, p_{kj}$$

  • Need also $\sum_j \pi_j = 1$

$$\pi = \pi P$$

In [9]:
# eigenvalue = 1 and associated eigenvector

d, v = np.linalg.eig(P.T)

print(d) # look for eigenvalue = 1
[-0.25+0.32274861j -0.25-0.32274861j  1.  +0.j        ]
In [10]:
print(v[:,2]/np.sum(v[:,2]))
[[0.3-0.j]
 [0.4-0.j]
 [0.3-0.j]]
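As a cross-check on the eigenvector computation, $\pi$ can also be found by solving the linear system $\pi(P - I) = 0$ together with the normalization $\sum_j \pi_j = 1$, for example by replacing one redundant equation with the sum constraint:

```python
import numpy as np

P = np.array([[0, 0, 1],
              [1/2, 1/2, 0],
              [1/3, 2/3, 0]])

# pi (P - I) = 0  ->  (P - I)^T pi^T = 0; replace the last (redundant)
# equation with the normalization sum(pi) = 1
A = (P - np.eye(3)).T
A[-1, :] = 1
b = np.array([0, 0, 1])

pi = np.linalg.solve(A, b)
print(pi)        # approximately [0.3, 0.4, 0.3]
print(pi @ P)    # stationary: pi P = pi
```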