r:markovchain [Applied Statistics and Data Mining]

markovchain

markovchainパッケージは、離散状態離散時間マルコフ過程(通称は離散マルコフ過程、あるいはマルコフ連鎖)をRで扱うのに便利な機能を提供する。以下の説明は、https://cran.r-project.org/web/packages/markovchain/vignettes/an_introduction_to_markovchain_package.pdfに基づいている。

このパッケージはR 3.5.0以上を必要とする。各自でダウンロードしてインストールする必要があるかもしれない。

マルコフ解析

マルコフ過程に基づいて現象の性質を明らかにしたり、未来の時点のおける状態の確率分布を推定することなどを、マルコフ解析という。

マルコフ解析に必要なもの

マルコフ解析に必要なものは、現在の状態もしくは初期の状態と、対象の状態遷移行列である。

状態
状態遷移行列

天気の移り変わりを例に説明を進める。

weatherStates <- c("sunny", "cloudy", "rain")

byRow <- TRUE

 weatherMatrix <- matrix(data = c(0.70, 0.2, 0.1,
 0.3, 0.4, 0.3,
 0.2, 0.45, 0.35), byrow = byRow, nrow = 3,
 dimnames = list(weatherStates, weatherStates))

mcWeather <- new("markovchain", states = weatherStates, byrow = byRow,
 transitionMatrix = weatherMatrix, name = "Weather")

markovchainのメソッド

Method	Purpose
*	Direct multiplication for transition matrices.
[	Direct access to the elements of the transition matrix.
==	Equality operator between two transition matrices.
as	Operator to convert markovchain objects into data.frame and table object.
dim	Dimension of the transition matrix.
names	Equal to states.
names←	Change the states name.
name	Get the name of markovchain object.
name←	Change the name of markovchain object.
plot	plot method for markovchain objects.
print	print method for markovchain objects.
show	show method for markovchain objects.
sort	sort method for markovchain objects.
states	Name of the transition states.
t	Transposition operator (which switches byrow slot value and modifies the transition matrix coherently).

markovchainの使い方の基本

上で定義したマルコフ連鎖を確認しておく。

weatherStates
weatherMatrix
mcWeather

さて、初期状態は曇りcloudy。

initialState = c(0, 1, 0)

二日後の天気の確率分布を求めてみる。

after2Days <- initialState * (mcWeather * mcWeather)

求めた確率分布を確認する。

after2Days

７日後の天気の条件付き確率分布を推定する。

after7Days <- initialState * (mcWeather ^ 7)

求めた確率分布を確認する。

after7Days

こんなに長い桁は必要ないので、四捨五入して小数点以下第３位まで表示してみる。

round(after7Days, 3)

状態を列ベクトルで表すなら、すべてを転地すればいい。

initialState <- c(0, 1, 0)
after2Days <- (t(mcWeather) * t(mcWeather)) * initialState
after7Days <- (t(mcWeather) ^ 7) * initialState
after2Days

fvals<-function(mchain,initialstate,n) {
out<-data.frame()
names(initialstate)<-names(mchain)
for (i in 0:n)
{
iteration<-initialstate*mchain^(i)
out<-rbind(out,iteration)
}
out<-cbind(out, i=seq(0,n))
out<-out[,c(4,1:3)]
return(out)
}
fvals(mchain=mcWeather,initialstate=c(90,5,5),n=4)

関数statesはマルコフ連鎖のオブジェクトmcWeatherの中の状態の一覧を表示させる。

states(mcWeather)

関数namesもstatesと同じ動作をする。

names(mcWeather)

関数dimは状態数を返す。

dim(mcWeather)

関数nameはマルコフ連鎖の名前を返す。これはnewで作成するときにつけている。

name(mcWeather)

markovchain:::sort(mcWeather)

transitionProbability(mcWeather, "cloudy", "rain")

mcWeather[2,3]

print(mcWeather)

show(mcWeather)

plot(mcWeather, package="diagram",box.size = 0.04)

mcDf <- as(mcWeather, "data.frame")
mcNew <- as(mcDf, "markovchain")
mcDf

mcIgraph <- as(mcWeather, "igraph")

require(msm)
Q <- rbind ( c(0, 0.25, 0, 0.25),
c(0.166, 0, 0.166, 0.166),
c(0, 0.25, 0, 0.25),
c(0, 0, 0, 0) )
cavmsm <- msm(state ~ years, subject = PTNUM, data = cav, qmatrix = Q, death = 4)
msmMc <- as(cavmsm, "markovchain")
msmMc

myMatr<-matrix(c(.1,.8,.1,.2,.6,.2,.3,.4,.3), byrow=TRUE, ncol=3)
myMc<-as(myMatr, "markovchain")
myMc

マルコフ連鎖の確率推論

Method	Returns
absorbingStates	the absorbing states of the transition matrix, if any.
steadyStates	the vector(s) of steady state(s) in matrix form.
communicatingClasses	list of communicating classes. sj , given actual state si.
canonicForm	the transition matrix into canonic form.
is.accessible	verification if a state j is reachable from state i.
is.irreducible	verification whether a DTMC is irreducible.
period	the period of an irreducible DTMC.
recurrentClasses	list of recurrent classes.
steadyStates	the vector(s) of steady state(s) in matrix form.
summary	DTMC summary.
transientStates	the transient states of the transition matrix, if any.

conditionalDistribution(mcWeather, "sunny")

steadyStates(mcWeather)

absorbingStates(mcWeather)

summary(mcWeather)

transientStates(mcWeather)

is.accessible(object = mcWeather, from = "sunny", to = "cloudy")

period(mcWeather)

firstPassage(object = mcWeather, state = "sunny",n = 10)

マルコフ連鎖のための統計的推測

Function	Purpose
markovchainFit	Function to return fitted Markov chain for a given sequence.
predict	Method to calculate predictions from markovchain or markovchainList objects.
rmarkovchain	Function to sample from markovchain or markovchainList objects.

状態遷移のデータがあるとする。

Alofi島の１日あたりの降雨のデータがある。

data(rain)

このデータは、２つのフィールドV1とrainを持つ。使うのは rain の方。一日ごとの推移を集計してみる。

table(data.frame(before=rain$rain[-1], after=rain$rain[-length(rain$rain)]))

出力は次の通り。

      after
before   0 1-5  6+
   0   362 136  50
   1-5 126  90  79
   6+   60  68 124

この推移をグラフに表してみる。Gmiscパッケージの中の関数transitionPlotを使うために、準備をする。

install.packages(c("rms","sandwich","stringr","Hmisc"))
reps = c("http://ftp.sunet.se/pub/lang/CRAN","http://cran.gforge.se")
install.packages("Gmisc", repos=reps, dependencies=TRUE, type="source")

そして、使ってみる。

require(Gmisc)
transitionPlot(table(data.frame(r1=rain$rain[-1],r2=rain$rain[-length(rain$rain)])),
  overlap_add_width=1.2,type_of_arrow="gradient",
  min_lwd = unit(2, "mm"), max_lwd = unit(15, "mm"),cex=1.5)

library(dplyr)
library(stringr)
library(DiagrammeR)
library(networkD3)

この準備で頑張ると、こんなグラフも描けるらしい。

この推移から、遷移行列を推定する。

markovchainFit(rain$rain, method="mle", byrow=TRUE)

次の推定値などが表示される。

$estimate
            0       1-5        6+
0   0.6605839 0.2299270 0.1094891
1-5 0.4625850 0.3061224 0.2312925
6+  0.1976285 0.3122530 0.4901186


$standardError
             0        1-5         6+
0   0.03471952 0.02048353 0.01413498
1-5 0.03966634 0.03226814 0.02804834
6+  0.02794888 0.03513120 0.04401395

$confidenceLevel
[1] 0.95

$lowerEndpointMatrix
            0       1-5         6+
0   0.6034754 0.1962346 0.08623909
1-5 0.3973397 0.2530461 0.18515711
6+  0.1516566 0.2544673 0.41772208

$upperEndpointMatrix
            0       1-5        6+
0   0.7176925 0.2636194 0.1327390
1-5 0.5278304 0.3591988 0.2774279
6+  0.2436003 0.3700387 0.5625151

$logLikelihood
[1] -1040.419

rain.MCに代入しておく。

rain.mcfit <- markovchainFit(rain$rain, method="mle", byrow=TRUE)

返して貰った内容を見てみる。

str(rain.mcfit)

rain.MCはS3オブジェクトのリストだが、rain.MC $estimateはS4オブジェクトとややこしい。遷移行列を行列として取り出すには、次のようにリストの名前とその中のスロットを指定する。 <code> rain.mcfit$ estimate@transitionMatrix </code>

これによって新たなマルコフ連鎖を定義するには、次のように書く。

mcrain = new("markovchain", states = rain.mcfit$estimate@states,
      byrow = TRUE,
      transitionMatrix = rain.mcfit$estimate@transitionMatrix, name = "rain")

関数markovchainFitは、markovchainオブジェクトをリストの要素として出力しているので、そちらを使うのも良い。

mcrain = rain.mcfit$estimate

rain.mcfit <- markovchainFit(rain$rain, method="laplace", laplacian=0.5, byrow=TRUE)

Applied Statistics and Data Mining

サイト用ツール

ユーザ用ツール

目次

markovchain

マルコフ解析

マルコフ解析に必要なもの

markovchainのメソッド

markovchainの使い方の基本

マルコフ連鎖の確率推論

マルコフ連鎖のための統計的推測

サイト用ツール

ページ用ツール

ユーザ用ツール

目次

markovchain

マルコフ解析

マルコフ解析に必要なもの

markovchainのメソッド

markovchainの使い方の基本

マルコフ連鎖の確率推論

マルコフ連鎖のための統計的推測