差分

このページの2つのバージョン間の差分を表示します。

--- r:markovchain [2018/12/11 23:14] – created watalu
+++ r:markovchain [2018/12/17 11:47] (現在) – watalu
@@ 行 1: / 行 1: @@
-==== マルコフ性とマルコフ過程 ====
+==== markovchain ====
-時間をtで表し、時間の経過とともに変化する状態X[t]を考える。
+markovchainパッケージは、離散状態離散時間マルコフ過程(通称は離散マルコフ過程、あるいはマルコフ連鎖)をRで扱うのに便利な機能を提供する。以下の説明は、[[https://cran.r-project.org/web/packages/markovchain/vignettes/an_introduction_to_markovchain_package.pdf]]に基づいている。
-時間の間隔は等間隔とし、整数で表す。
-X[0], X[1], ..., X[t-1],X[t],X[t+1],...は時間の添え字を持つ確率変数の列である。
+  * [[https://dataconomy.com/2018/03/an-introduction-to-markov-chains-using-r/|An introduction to Markov chains using R]]
-この確率変数の列は(離散時間)確率過程と呼ばれる。
+  * [[https://www.datacamp.com/community/tutorials/markov-chain-analysis-r|Markov chain analysis with R]]
-任意の時点tより昔の確率変数の実現値の列x[0], x[1], ..., x[t-1]を時点tより前の履歴(history)といい、H[t-]と記す。
+  * [[https://cran.r-project.org/web/packages/markovchain/vignettes/markovchainCrashIntro.pdf|Crash introduction ]]
+  * [[https://cran.r-project.org/web/packages/markovchain/vignettes/an_introduction_to_markovchain_package.pdf|An introduction]]
-任意の時点tの状態X[t]の、それ以前のすべての時点の履歴を与えた条件付き確率分布Pr[X[t]=i|H[t-]] = Pr[X[t]=i|X[t-1]=x[t-1],X[t-2]=x[t-2],X[t-3]=x[t-3],...]が、直前の１時点のみを与えて定まる、すなわちPr[X[t]=i|H[t-]] = Pr[X[t]=i|X[t-1]=x[t-1]]を満たす時、その確率過程はマルコフ性を持つという。
+このパッケージはR 3.5.0以上を必要とする。各自でダウンロードしてインストールする必要があるかもしれない。
-マルコフ性を持つ確率過程をマルコフ過程という。
+=== マルコフ解析 ===
-以上ではX[t]が離散確率変数の場合を紹介したが、連続確率変数でも条件付き確率密度関数を用いて、同様の議論が展開できる。
+マルコフ過程に基づいて現象の性質を明らかにしたり、未来の時点のおける状態の確率分布を推定することなどを、マルコフ解析という。
-それを連続状態離散時間マルコフ過程という。
-現在の時点をt、少し先の未来の時点をt+1と置くと、
+=== マルコフ解析に必要なもの ===
-Pr[X[t+1]=i|H[t]] = Pr[X[t+1]=i|X[t]=x[t]]
-となり、未来を予測するのに、過去のすべてのデータを参照する必要はなく、現在の状態x[t]が必要なすべての情報である、ということにマルコフ性は等しい。
+マルコフ解析に必要なものは、現在の状態もしくは初期の状態と、対象の状態遷移行列である。
+  * 状態
+  * 状態遷移行列
+天気の移り変わりを例に説明を進める。
+<code>
+weatherStates <- c("sunny", "cloudy", "rain")
+</code>
+<code>
+byRow <- TRUE
+</code>
+<code>
+ weatherMatrix <- matrix(data = c(0.70, 0.2, 0.1,
+.3, 0.4, 0.3,
+.2, 0.45, 0.35), byrow = byRow, nrow = 3,
+ dimnames = list(weatherStates, weatherStates))
+</code>
+<code>
+mcWeather <- new("markovchain", states = weatherStates, byrow = byRow,
+ transitionMatrix = weatherMatrix, name = "Weather")
+</code>
+=== markovchainのメソッド ===
+|Method|Purpose|
+|*|Direct multiplication for transition matrices.|
+|[|Direct access to the elements of the transition matrix.|
+|==|Equality operator between two transition matrices.|
+|as|Operator to convert markovchain objects into data.frame and table object.|
+|dim|Dimension of the transition matrix.|
+|names|Equal to states.|
+|names<-|Change the states name.|
+|name|Get the name of markovchain object.|
+|name<-|Change the name of markovchain object.|
+|plot|plot method for markovchain objects.|
+|print|print method for markovchain objects.|
+|show|show method for markovchain objects.|
+|sort|sort method for markovchain objects.|
+|states|Name of the transition states.|
+|t|Transposition operator (which switches byrow slot value and modifies the transition matrix coherently).|
+=== markovchainの使い方の基本 ===
+上で定義したマルコフ連鎖を確認しておく。
+<code>
+weatherStates
+weatherMatrix
+mcWeather
+</code>
+さて、初期状態は曇りcloudy。
+<code>
+initialState = c(0, 1, 0)
+</code>
+二日後の天気の確率分布を求めてみる。
+<code>
+after2Days <- initialState * (mcWeather * mcWeather)
+</code>
+求めた確率分布を確認する。
+<code>
+after2Days
+</code>
+７日後の天気の条件付き確率分布を推定する。
+<code>
+after7Days <- initialState * (mcWeather ^ 7)
+</code>
+求めた確率分布を確認する。
+<code>
+after7Days
+</code>
+こんなに長い桁は必要ないので、四捨五入して小数点以下第３位まで表示してみる。
+<code>
+round(after7Days, 3)
+</code>
+状態を列ベクトルで表すなら、すべてを転地すればいい。
+<code>
+initialState <- c(0, 1, 0)
+after2Days <- (t(mcWeather) * t(mcWeather)) * initialState
+after7Days <- (t(mcWeather) ^ 7) * initialState
+after2Days
+</code>
+<code>
+fvals<-function(mchain,initialstate,n) {
+out<-data.frame()
+names(initialstate)<-names(mchain)
+for (i in 0:n)
+{
+iteration<-initialstate*mchain^(i)
+out<-rbind(out,iteration)
+}
+out<-cbind(out, i=seq(0,n))
+out<-out[,c(4,1:3)]
+return(out)
+}
+fvals(mchain=mcWeather,initialstate=c(90,5,5),n=4)
+</code>
+関数statesはマルコフ連鎖のオブジェクトmcWeatherの中の状態の一覧を表示させる。
+<code>
+states(mcWeather)
+</code>
+関数namesもstatesと同じ動作をする。
+<code>
+names(mcWeather)
+</code>
+関数dimは状態数を返す。
+<code>
+dim(mcWeather)
+</code>
+関数nameはマルコフ連鎖の名前を返す。これはnewで作成するときにつけている。
+<code>
+name(mcWeather)
+</code>
+<code>
+markovchain:::sort(mcWeather)
+</code>
+<code>
+transitionProbability(mcWeather, "cloudy", "rain")
+</code>
+<code>
+mcWeather[2,3]
+</code>
+<code>
+print(mcWeather)
+</code>
+<code>
+show(mcWeather)
+</code>
+<code>
+plot(mcWeather, package="diagram",box.size = 0.04)
+</code>
+<code>
+mcDf <- as(mcWeather, "data.frame")
+mcNew <- as(mcDf, "markovchain")
+mcDf
+</code>
+<code>
+mcIgraph <- as(mcWeather, "igraph")
+</code>
+<code>
+require(msm)
+Q <- rbind ( c(0, 0.25, 0, 0.25),
+c(0.166, 0, 0.166, 0.166),
+c(0, 0.25, 0, 0.25),
+c(0, 0, 0, 0) )
+cavmsm <- msm(state ~ years, subject = PTNUM, data = cav, qmatrix = Q, death = 4)
+msmMc <- as(cavmsm, "markovchain")
+msmMc
+</code>
+<code>
+myMatr<-matrix(c(.1,.8,.1,.2,.6,.2,.3,.4,.3), byrow=TRUE, ncol=3)
+myMc<-as(myMatr, "markovchain")
+myMc
+</code>
+=== マルコフ連鎖の確率推論 ===
+|Method|Returns|
+|absorbingStates|the absorbing states of the transition matrix, if any.|
+|steadyStates|the vector(s) of steady state(s) in matrix form.|
+|communicatingClasses|list of communicating classes. sj , given actual state si.|
+|canonicForm|the transition matrix into canonic form.|
+|is.accessible|verification if a state j is reachable from state i.|
+|is.irreducible|verification whether a DTMC is irreducible.|
+|period|the period of an irreducible DTMC.|
+|recurrentClasses|list of recurrent classes.|
+|steadyStates|the vector(s) of steady state(s) in matrix form.|
+|summary|DTMC summary.|
+|transientStates|the transient states of the transition matrix, if any.|
+<code>
+conditionalDistribution(mcWeather, "sunny")
+</code>
+<code>
+steadyStates(mcWeather)
+</code>
+<code>
+absorbingStates(mcWeather)
+</code>
+<code>
+summary(mcWeather)
+</code>
+<code>
+transientStates(mcWeather)
+</code>
+<code>
+is.accessible(object = mcWeather, from = "sunny", to = "cloudy")
+</code>
+<code>
+period(mcWeather)
+</code>
+<code>
+firstPassage(object = mcWeather, state = "sunny",n = 10)
+</code>
+=== マルコフ連鎖のための統計的推測 ===
+|Function|Purpose|
+|markovchainFit|Function to return fitted Markov chain for a given sequence.|
+|predict|Method to calculate predictions from markovchain or markovchainList objects.|
+|rmarkovchain|Function to sample from markovchain or markovchainList objects.|
+状態遷移のデータがあるとする。
+Alofi島の１日あたりの降雨のデータがある。
+<code>
+data(rain)
+</code>
+このデータは、２つのフィールドV1とrainを持つ。使うのは rain の方。
+一日ごとの推移を集計してみる。
+<code>
+table(data.frame(before=rain$rain[-1], after=rain$rain[-length(rain$rain)]))
+</code>
+出力は次の通り。
+<code>
+      after
+before   0 1-5  6+
+   362 136  50
+-5 126  90  79
++   60  68 124
+</code>
+この推移をグラフに表してみる。Gmiscパッケージの中の関数transitionPlotを使うために、準備をする。
+<code>
+install.packages(c("rms","sandwich","stringr","Hmisc"))
+reps = c("http://ftp.sunet.se/pub/lang/CRAN","http://cran.gforge.se")
+install.packages("Gmisc", repos=reps, dependencies=TRUE, type="source")
+</code>
+そして、使ってみる。
+<code>
+require(Gmisc)
+transitionPlot(table(data.frame(r1=rain$rain[-1],r2=rain$rain[-length(rain$rain)])),
+  overlap_add_width=1.2,type_of_arrow="gradient",
+  min_lwd = unit(2, "mm"), max_lwd = unit(15, "mm"),cex=1.5)
+</code>
+{{:r:rain-transitionplot.png|}}
+<code>
+library(dplyr)
+library(stringr)
+library(DiagrammeR)
+library(networkD3)
+</code>
+この準備で頑張ると、[[https://kazutan.github.io/kazutanR/koneta_state_trans_viz.html|こんな]]グラフも描けるらしい。
+この推移から、遷移行列を推定する。
+<code>
+markovchainFit(rain$rain, method="mle", byrow=TRUE)
+</code>
+次の推定値などが表示される。
+<code>
+$estimate
+       1-5        6+
+   0.6605839 0.2299270 0.1094891
+-5 0.4625850 0.3061224 0.2312925
++  0.1976285 0.3122530 0.4901186
+$standardError
+        1-5         6+
+   0.03471952 0.02048353 0.01413498
+-5 0.03966634 0.03226814 0.02804834
++  0.02794888 0.03513120 0.04401395
+$confidenceLevel
+[1] 0.95
+$lowerEndpointMatrix
+       1-5         6+
+   0.6034754 0.1962346 0.08623909
+-5 0.3973397 0.2530461 0.18515711
++  0.1516566 0.2544673 0.41772208
+$upperEndpointMatrix
+       1-5        6+
+   0.7176925 0.2636194 0.1327390
+-5 0.5278304 0.3591988 0.2774279
++  0.2436003 0.3700387 0.5625151
+$logLikelihood
+[1] -1040.419
+</code>
+rain.MCに代入しておく。
+<code>
+rain.mcfit <- markovchainFit(rain$rain, method="mle", byrow=TRUE)
+</code>
+返して貰った内容を見てみる。
+<code>
+str(rain.mcfit)
+</code>
+rain.MCはS3オブジェクトのリストだが、rain.MC$estimateはS4オブジェクトとややこしい。
+遷移行列を行列として取り出すには、次のようにリストの名前とその中のスロットを指定する。
+<code>
+rain.mcfit$estimate@transitionMatrix
+</code>
+これによって新たなマルコフ連鎖を定義するには、次のように書く。
+<code>
+mcrain = new("markovchain", states = rain.mcfit$estimate@states,
+      byrow = TRUE,
+      transitionMatrix = rain.mcfit$estimate@transitionMatrix, name = "rain")
+</code>
+関数markovchainFitは、markovchainオブジェクトをリストの要素として出力しているので、そちらを使うのも良い。
+<code>
+mcrain = rain.mcfit$estimate
+</code>
+<code>
+rain.mcfit <- markovchainFit(rain$rain, method="laplace", laplacian=0.5, byrow=TRUE)
+</code>
-なお、互いに独立な確率変数の列もマルコフ性は満たす。

Applied Statistics and Data Mining

サイト用ツール

ユーザ用ツール

差分

サイト用ツール

ページ用ツール

ユーザ用ツール

差分