差分

このページの2つのバージョン間の差分を表示します。

--- r:survival [2018/12/09 18:36] – watalu
+++ r:survival [2018/12/11 19:42] (現在) – watalu
@@ 行 1: / 行 1: @@
 ==== Rによる生存時間解析 ====
+[[r:survival:code]]の方に記したコードを、先にRに貼り付けて読み込ませておくこと。いくつかのグラフを描く関数を用意しておいたものである。参考までに、用いているRの関数を個別に詳細に説明したページも用意している。
+  * [[r:survival:data.frame]]は寿命データの記し方を説明している
+  * [[r:mass:fitdistr]]は完全データからの確率分布のパラメータの最尤法に基づいて推定する
+  * [[r:survival:survfit]]は打ち切りを含む寿命データから、生存関数をノンパラメトリックに推定する
+ただし、実験の内容を実施するためには、上の３つのページは読まずとも問題なく、下の説明に従って進めれば十分である。
 === データの書き方 ===
@@ 行 89: / 行 97: @@
 <code>
 X$status
+</code>
+=== 信頼性データ図 ===
+信頼性データを図示する関数draw.reliability.data.diagramを用意した。
+上に与えたデータの図を描くには次の１行を実行する。
+<code>
+plot.reliability.data.diagram(X)
 </code>
@@ 行 99: / 行 116: @@
 </code>
-これを実行しておかないと、すぐ次のコマンドの実行時に赤い文字でエラーが表示される。
+これを実行しておかないと、すぐ次のコマンドの実行時に青い文字でエラーが表示されるかもしれない。
 === 生存曲線と故障率曲線のノンパラメトリック推定 ===
@@ 行 240: / 行 257: @@
 === 関数 ===
-完全データについて、これら三つの確率分布を推定し、ヒストグラムと比較するための関数を用意した。
+完全データについて、これら三つの確率分布を推定し、ヒストグラムと比較するための関数compare.histgram.and.densitiesを用意した。
-<code>
-draw.histgram.and.densities <- function(X,hist=TRUE) {
-  require(MASS)
-  est.weibull <- fitdistr(X$time,densfun="weibull")
-  est.lnorm <- fitdistr(X$time,densfun="lognormal")
-  est.gamma <- fitdistr(X$time,densfun="gamma")
-  if(hist==TRUE) {
-    hist(X$time,freq=FALSE,ylim=c(0,0.3),
-         xlab="Lifetime")
-    plot(function(x){return(dlnorm(x,meanlog=est.lnorm$estimate[1],
-                                   sdlog=est.lnorm$estimate[2]))},
-         xlim=c(0,12),
-         col="blue",
-         add=TRUE)
-  } else {
-    plot(function(x){return(dlnorm(x,meanlog=est.lnorm$estimate[1],
-                                   sdlog=est.lnorm$estimate[2]))},
-         xlim=c(0,12),
-         col="blue",xlab="Lifetime",ylab="Density")
-  }
-  plot(function(x){return(dweibull(x,shape=est.weibull$estimate[1],
-                                   scale=est.weibull$estimate[2]))},
-       xlim=c(0,12),
-       col="red",
-       add=TRUE)
-  plot(function(x){return(dgamma(x,shape=est.gamma$estimate[1],
-                                 rate=est.gamma$estimate[2]))},
-       xlim=c(0,12),
-       col="green",
-       add=TRUE)
-  legend("topright",
-         col=c("blue","red","green"),
-         lty=c(1,1,1),
-         legend=c("lognormal","weibull","gamma"))
-  return(list(weibull=est.weibull,lnorm=est.lnorm,gamma=est.gamma))
-}
-</code>
 オプションのhistをhist=FALSEとすると、ヒストグラムは描かず、密度関数のみを描くようになる。
-またカプラン・マイヤー法で推定した生存関数と共に描くための関数も用意した。
+またカプラン・マイヤー法で推定した生存関数と共に描くための関数compare.KM.and.distributionsも用意した。
+これで、どの分布に最も近いかを検討できる。
 <code>
-compare.KM.and.distributions <- function(fit,xlim=NULL) {
+X.fits <- compare.histgram.and.densities(X)
-  require(survival)
+compare.KM.and.distributions(X.fits,data=X)
-  X.km <- survfit(Surv(time,status)~1,data=X)
-  if(!is.null(xlim)) {
-    plot(X.km,
-         xlab="Time",
-         ylab="Survival Function",xlim=xlim)
-    plot(function(x) pweibull(x,shape=X.fits$weibull$estimate[1],scale=X.fits$weibull$estimate[2],
-                              lower.tail=FALSE),
-         xlim=xlim,add=TRUE,col="green")
-    plot(function(x) plnorm(x,meanlog=X.fits$lnorm$estimate[1],sdlog=X.fits$lnorm$estimate[2],
-                            lower.tail=FALSE),
-         xlim=xlim,add=TRUE,col="blue")
-    plot(function(x) pgamma(x,shape=X.fits$gamma$estimate[1],rate=X.fits$gamma$estimate[2],
-                            lower.tail=FALSE),
-         xlim=xlim,add=TRUE,col="orange")
-  } else {
-    plot(X.km,
-         xlab="Time",
-         ylab="Survival Function",xlim=xlim)
-    plot(function(x) pweibull(x,shape=X.fits$weibull$estimate[1],scale=X.fits$weibull$estimate[2],
-                              lower.tail=FALSE),
-         xlim=c(0,max(X.km$time)),add=TRUE,col="green")
-    plot(function(x) plnorm(x,meanlog=X.fits$lnorm$estimate[1],sdlog=X.fits$lnorm$estimate[2],
-                            lower.tail=FALSE),
-         xlim=c(0,max(X.km$time)),add=TRUE,col="blue")
-    plot(function(x) pgamma(x,shape=X.fits$gamma$estimate[1],rate=X.fits$gamma$estimate[2],
-                            lower.tail=FALSE),
-         xlim=c(0,max(X.km$time)),add=TRUE,col="orange")
-  }
-  legend("topright",lty=c(1,2,1,1,1),col=c("black","black","green","blue","orange"),
-         legend=c("Estimate","95% Confidence Interval",
-                  paste("Weibull",round(X.fits$weibull$loglik,5)),
-                  paste("Lognormal",round(X.fits$lnorm$loglik,5)),
-                  paste("Gamma",round(X.fits$gamma$loglik,5))),
-         cex=0.5)
-}
 </code>
-これで、どの分布に最も近いかを検討できる。
 == 一部が打ち切られているデータ ==
@@ 行 330: / 行 277: @@
 </code>
+関数survregは、ワイブル分布weibull、対数正規分布lognormal、ロジスティック分布logistic、対数ロジスティック分布loglogisticの４種類の確率分布の最尤推定法を提供する。ガンマ分布には対応していない。
+そのため以下では、ワイブル分布と対数正規分布のみを対象として、話を進める。
+<code>
+X.survreg.weibull <- survreg(Surv(time, status)~1, data=X, dist="weibull")
+names(X.survreg.weibull)
+c(1/X.survreg.weibull$scale,exp(X.survreg.weibull$coefficients))
+summary(X.survreg.weibull)
+X.fit.weibull$loglik
+</code>
+<code>
+X.survreg.lognormal <- survreg(Surv(time, status)~1, data=X, dist="lognormal")
+names(X.survreg.lognormal)
+c(X.survreg.lognormal$coefficients, X.survreg.lognormal$scale)
+summary(X.survreg.lognormal)
+X.fit.lognormal$loglik
+</code>
+これらをカプラン・マイヤー曲線に重ねて描く関数compare.KM.and.distributions.2 も用意した。
+<code>
+compare.KM.and.distributions.2(X)
+</code>

Applied Statistics and Data Mining

サイト用ツール

ユーザ用ツール

差分

サイト用ツール

ページ用ツール

ユーザ用ツール

差分